-
Evaluating amyloid-beta as a surrogate endpoint in trials of anti-amyloid drugs in Alzheimer's disease: a Bayesian meta-analysis
Authors:
Sa Ren,
Janharpreet Singh,
Sandro Gsteiger,
Christopher Cogley,
Ben Reed,
Keith R Abrams,
Dalia Dawoud,
Rhiannon K Owen,
Paul Tappenden,
Terrence J Quinn,
Sylwia Bujkiewicz
Abstract:
The use of amyloid-beta (A$β$) clearance to support regulatory approvals of drugs in Alzheimer's disease (AD) remains controversial. We evaluate A$β$ as a potential trial-level surrogate endpoint for clinical function in AD using a meta-analysis. Randomised controlled trials (RCTs) reporting data on the effectiveness of anti-A$β$ monoclonal antibodies (MABs) on A$β$ and clinical outcomes were identified through a literature review. A Bayesian bivariate meta-analysis was used to evaluate surrogate relationships between the treatment effects on A$β$ and clinical function, with the intercept, slope and variance quantifying the trial-level association. The analysis was performed using RCT data both collectively across all MABs and separately for each MAB through subgroup analysis. The latter analysis was extended by applying Bayesian hierarchical models to borrow information across treatments. We identified 23 RCTs with 39 treatment contrasts for seven MABs. The association between treatment effects on A$β$ and Clinical Dementia Rating - Sum of Boxes (CDR-SOB) across all MABs was strong, with an intercept of -0.03 (95% credible interval: -0.16, 0.11), a slope of 1.41 (0.60, 2.21) and a variance of 0.02 (0.00, 0.05). For individual treatments, the surrogate relationships were suboptimal, displaying large uncertainty. The use of hierarchical models considerably reduced the uncertainty around key parameters, narrowing the intervals for the slopes by an average of 71% (range: 51%-95%) and for the variances by 28% (7%-65%). Our results suggest that A$β$ is a potential surrogate endpoint for CDR-SOB when assuming a common surrogate relationship across all MABs. When allowing for information-sharing, the surrogate relationships improved, but only for lecanemab and aducanumab was the improvement sufficient to support a surrogate relationship.
Submitted 9 April, 2025;
originally announced April 2025.
-
Bi-Criteria Optimization for Combinatorial Bandits: Sublinear Regret and Constraint Violation under Bandit Feedback
Authors:
Vaneet Aggarwal,
Shweta Jain,
Subham Pokhriyal,
Christopher John Quinn
Abstract:
In this paper, we study bi-criteria optimization for combinatorial multi-armed bandits (CMAB) with bandit feedback. We propose a general framework that transforms discrete bi-criteria offline approximation algorithms into online algorithms with sublinear regret and cumulative constraint violation (CCV) guarantees. Our framework requires the offline algorithm to provide an $(α, β)$-bi-criteria approximation ratio with $δ$-resilience and utilize $\texttt{N}$ oracle calls to evaluate the objective and constraint functions. We prove that the proposed framework achieves sublinear regret and CCV, with both bounds scaling as ${O}\left(δ^{2/3} \texttt{N}^{1/3}T^{2/3}\log^{1/3}(T)\right)$. Crucially, the framework treats the offline algorithm with $δ$-resilience as a black box, enabling flexible integration of existing approximation algorithms into the CMAB setting. To demonstrate its versatility, we apply our framework to several combinatorial problems, including submodular cover, submodular cost covering, and fair submodular maximization. These applications highlight the framework's broad utility in adapting offline guarantees to online bi-criteria optimization under bandit feedback.
Submitted 15 March, 2025;
originally announced March 2025.
-
Stochastic $k$-Submodular Bandits with Full Bandit Feedback
Authors:
Guanyu Nie,
Vaneet Aggarwal,
Christopher John Quinn
Abstract:
In this paper, we present the first sublinear $α$-regret bounds for online $k$-submodular optimization problems with full-bandit feedback, where $α$ is a corresponding offline approximation ratio. Specifically, we propose online algorithms for multiple $k$-submodular stochastic combinatorial multi-armed bandit problems, including (i) monotone functions and individual size constraints, (ii) monotone functions with matroid constraints, (iii) non-monotone functions with matroid constraints, (iv) non-monotone functions without constraints, and (v) monotone functions without constraints. We transform approximation algorithms for offline $k$-submodular maximization problems into online algorithms through the offline-to-online framework proposed by Nie et al. (2023a). A key contribution of our work is analyzing the robustness of the offline algorithms.
Submitted 14 December, 2024;
originally announced December 2024.
-
Combinatorial Stochastic-Greedy Bandit
Authors:
Fares Fourati,
Christopher John Quinn,
Mohamed-Slim Alouini,
Vaneet Aggarwal
Abstract:
We propose a novel combinatorial stochastic-greedy bandit (SGB) algorithm for combinatorial multi-armed bandit problems when no extra information other than the joint reward of the selected set of $n$ arms at each time step $t\in [T]$ is observed. SGB adopts an optimized stochastic-explore-then-commit approach and is specifically designed for scenarios with a large set of base arms. Unlike existing methods that explore the entire set of unselected base arms during each selection step, our SGB algorithm samples only an optimized proportion of unselected arms and selects actions from this subset. We prove that our algorithm achieves a $(1-1/e)$-regret bound of $\mathcal{O}(n^{\frac{1}{3}} k^{\frac{2}{3}} T^{\frac{2}{3}} \log(T)^{\frac{2}{3}})$ for monotone stochastic submodular rewards, which outperforms the state-of-the-art in terms of the cardinality constraint $k$. Furthermore, we empirically evaluate the performance of our algorithm in the context of online constrained social influence maximization. Our results demonstrate that our proposed approach consistently outperforms the other algorithms, increasing the performance gap as $k$ grows.
Submitted 13 December, 2023;
originally announced December 2023.
-
MetaBayesDTA: Codeless Bayesian meta-analysis of test accuracy, with or without a gold standard
Authors:
Enzo Cerullo,
Alex J. Sutton,
Hayley E. Jones,
Olivia Wu,
Terry J. Quinn,
Nicola J. Cooper
Abstract:
Introduction: Despite their applicability, statistical models used for the meta-analysis of test accuracy require specialised knowledge to implement, with the necessary level of expertise having recently increased. This is due to the development and recommendation of more sophisticated methods, such as those in Version 2 of the Cochrane Handbook for Systematic Reviews of Diagnostic Test Accuracy. This paper describes a web-based application that extends the functionality of previous applications, making many advanced analysis methods more accessible.
Methods: We sought to create an extended, stand-alone, Bayesian version of MetaDTA, which (i) has the benefits of previously proposed applications and addresses their key limitations, (ii) is accessible to researchers who do not have the specific expertise required to fit such models, and (iii) is suitable for experienced analysts. We created the application using Shiny and Stan.
Results: We created MetaBayesDTA (https://crsu.shinyapps.io/MetaBayesDTA/), which allows users to conduct meta-analysis of test accuracy, with or without a gold standard. The application addresses several key limitations of other applications. For instance, for the bivariate model, one can conduct subgroup analysis, univariate meta-regression, and comparative test accuracy evaluation. Meanwhile, for the model which does not assume a perfect gold standard, the application can account for the fact that studies use different reference tests.
Conclusions: Due to its user-friendliness and broad array of features, MetaBayesDTA should appeal to a wide variety of researchers. We anticipate that the application will encourage wider use of more advanced methods, which ultimately should improve the quality of test accuracy reviews.
Submitted 15 November, 2022; v1 submitted 10 November, 2022;
originally announced November 2022.
-
Meta-analysis of dichotomous and ordinal tests without a gold standard
Authors:
Enzo Cerullo,
Hayley E. Jones,
Olivia Carter,
Terry J. Quinn,
Nicola J. Cooper,
Alex J. Sutton
Abstract:
Standard methods for the meta-analysis of medical tests without a gold standard are limited to dichotomous data. Multivariate probit models are used to analyze correlated binary data, and can be extended to multivariate ordered probit models to model ordinal data. Within the context of an imperfect gold standard, they have previously been used for the analysis of dichotomous and ordinal tests in a single study, and for the meta-analysis of dichotomous tests. In this paper, we developed a hierarchical, latent class multivariate probit model for the simultaneous meta-analysis of ordinal and dichotomous tests without assuming a gold standard. The model can accommodate a hierarchical partial pooling model on the conditional within-study correlations, enabling one to obtain summary estimates of joint test accuracy. Dichotomous tests use probit regression likelihoods and ordinal tests use ordered probit regression likelihoods. We fitted the models using Stan, which uses a state-of-the-art Hamiltonian Monte Carlo algorithm. We applied the models to a dataset in which studies evaluated the accuracy of tests, and test combinations, for deep vein thrombosis. We first demonstrated the issues with dichotomising test accuracy data a priori without a gold standard by fitting models which dichotomised the ordinal test data, and then we applied models which do not dichotomise the data. Furthermore, we fitted and compared a variety of other models, including those which assumed conditional independence and dependence between tests, and those assuming a perfect and an imperfect gold standard.
Submitted 26 April, 2022; v1 submitted 11 March, 2021;
originally announced March 2021.
-
DART: aDaptive Accept RejecT for non-linear top-K subset identification
Authors:
Mridul Agarwal,
Vaneet Aggarwal,
Christopher J. Quinn,
Abhishek Umrawal
Abstract:
We consider the bandit problem of selecting $K$ out of $N$ arms at each time step. The reward can be a non-linear function of the rewards of the selected individual arms. The direct use of a multi-armed bandit algorithm requires choosing among $\binom{N}{K}$ options, making the action space large. To simplify the problem, existing works on combinatorial bandits typically assume feedback as a linear function of individual rewards. In this paper, we prove the lower bound for top-$K$ subset selection with bandit feedback with possibly correlated rewards. We present a novel algorithm for the combinatorial setting without using individual arm feedback or requiring linearity of the reward function. Additionally, our algorithm works on correlated rewards of individual arms. Our algorithm, aDaptive Accept RejecT (DART), sequentially finds good arms and eliminates bad arms based on confidence bounds. DART is computationally efficient and uses storage linear in $N$. Further, DART achieves a regret bound of $\tilde{\mathcal{O}}(K\sqrt{KNT})$ for a time horizon $T$, which matches the lower bound in bandit feedback up to a factor of $\sqrt{\log{2NT}}$. When applied to the problem of cross-selling optimization and maximizing the mean of individual rewards, the performance of the proposed algorithm surpasses that of state-of-the-art algorithms. We also show that DART significantly outperforms existing methods for both linear and non-linear joint reward environments.
Submitted 15 November, 2020;
originally announced November 2020.
-
Optimal Mini-Batch Size Selection for Fast Gradient Descent
Authors:
Michael P. Perrone,
Haidar Khan,
Changhoan Kim,
Anastasios Kyrillidis,
Jerry Quinn,
Valentina Salapura
Abstract:
This paper presents a methodology for selecting the mini-batch size that minimizes Stochastic Gradient Descent (SGD) learning time for single and multiple learner problems. By decoupling algorithmic analysis issues from hardware and software implementation details, we reveal a robust empirical inverse law between mini-batch size and the average number of SGD updates required to converge to a specified error threshold. Combining this empirical inverse law with measured system performance, we create an accurate, closed-form model of average training time and show how this model can be used to identify quantifiable implications for both algorithmic and hardware aspects of machine learning. We demonstrate the inverse law empirically, on both image recognition (MNIST, CIFAR10 and CIFAR100) and machine translation (Europarl) tasks, and provide a theoretical justification by proving a novel bound on mini-batch SGD training.
Submitted 14 November, 2019;
originally announced November 2019.
-
A High-Dimensional Particle Filter Algorithm
Authors:
Jameson Quinn
Abstract:
Online data assimilation in time series models over a large spatial extent is an important problem in both geosciences and robotics. Such models are intrinsically high-dimensional, rendering traditional particle filter algorithms ineffective. Though methods that begin to address this problem exist, they either rely on additional assumptions or lead to error that is spatially inhomogeneous. I present a novel particle-based algorithm for online approximation of the filtering problem on such models, using the fact that each locus affects only nearby loci at the next time step. The algorithm is based on a Metropolis-Hastings-like MCMC for creating hybrid particles at each step. I show simulation results that suggest the error of this algorithm is uniform in both space and time, with a lower bias, though higher variance, as compared to a previously-proposed algorithm.
Submitted 29 January, 2019;
originally announced January 2019.
-
Stochastic Top-$K$ Subset Bandits with Linear Space and Non-Linear Feedback
Authors:
Mridul Agarwal,
Vaneet Aggarwal,
Christopher J. Quinn,
Abhishek K. Umrawal
Abstract:
Many real-world problems like Social Influence Maximization face the dilemma of choosing the best $K$ out of $N$ options at a given time instant. This setup can be modeled as a combinatorial bandit which chooses $K$ out of $N$ arms at each time, with an aim to achieve an efficient trade-off between exploration and exploitation. This is the first work for combinatorial bandits where the feedback received can be a non-linear function of the chosen $K$ arms. The direct use of a multi-armed bandit requires choosing among $N$-choose-$K$ options, making the state space large. In this paper, we present a novel algorithm which is computationally efficient and uses storage linear in $N$. The proposed algorithm is a divide-and-conquer strategy that we call CMAB-SM. Further, the proposed algorithm achieves a regret bound of $\tilde O(K^{\frac{1}{2}}N^{\frac{1}{3}}T^{\frac{2}{3}})$ for a time horizon $T$, which is sub-linear in all parameters $T$, $N$, and $K$.
Submitted 11 October, 2021; v1 submitted 28 November, 2018;
originally announced November 2018.
-
Feature exploration for almost zero-resource ASR-free keyword spotting using a multilingual bottleneck extractor and correspondence autoencoders
Authors:
Raghav Menon,
Herman Kamper,
Ewald van der Westhuizen,
John Quinn,
Thomas Niesler
Abstract:
We compare features for dynamic time warping (DTW) when used to bootstrap keyword spotting (KWS) in an almost zero-resource setting. Such quickly-deployable systems aim to support United Nations (UN) humanitarian relief efforts in parts of Africa with severely under-resourced languages. Our objective is to identify acoustic features that provide acceptable KWS performance in such environments. As a supervised resource, we restrict ourselves to a small, easily acquired and independently compiled set of isolated keywords. For feature extraction, a multilingual bottleneck feature (BNF) extractor, trained on well-resourced out-of-domain languages, is integrated with a correspondence autoencoder (CAE) trained on extremely sparse in-domain data. On their own, BNFs and CAE features are shown to achieve a more than 2% absolute performance improvement over baseline MFCCs. However, by using BNFs as input to the CAE, even better performance is achieved, with a more than 11% absolute improvement in ROC AUC over MFCCs and more than twice as many top-10 retrievals for two evaluated languages, English and Luganda. We conclude that integrating BNFs with the CAE allows both large out-of-domain and sparse in-domain resources to be exploited for improved ASR-free keyword spotting.
Submitted 12 July, 2019; v1 submitted 14 November, 2018;
originally announced November 2018.
-
Automatic Speech Recognition for Humanitarian Applications in Somali
Authors:
Raghav Menon,
Astik Biswas,
Armin Saeb,
John Quinn,
Thomas Niesler
Abstract:
We present our first efforts in building an automatic speech recognition system for Somali, an under-resourced language, using 1.57 hrs of annotated speech for acoustic model training. The system is part of an ongoing effort by the United Nations (UN) to implement keyword spotting systems supporting humanitarian relief programmes in parts of Africa where languages are severely under-resourced. We evaluate several types of acoustic model, including recent neural architectures. Language model data augmentation using a combination of recurrent neural networks (RNN) and long short-term memory neural networks (LSTMs) as well as the perturbation of acoustic data are also considered. We find that both types of data augmentation are beneficial to performance, with our best system using a combination of convolutional neural networks (CNNs), time-delay neural networks (TDNNs) and bi-directional long short-term memory networks (BLSTMs) to achieve a word error rate of 53.75%.
Submitted 23 July, 2018;
originally announced July 2018.
-
ASR-free CNN-DTW keyword spotting using multilingual bottleneck features for almost zero-resource languages
Authors:
Raghav Menon,
Herman Kamper,
Emre Yilmaz,
John Quinn,
Thomas Niesler
Abstract:
We consider multilingual bottleneck features (BNFs) for nearly zero-resource keyword spotting. This forms part of a United Nations effort using keyword spotting to support humanitarian relief programmes in parts of Africa where languages are severely under-resourced. We use 1920 isolated keywords (40 types, 34 minutes) as exemplars for dynamic time warping (DTW) template matching, which is performed on a much larger body of untranscribed speech. These DTW costs are used as targets for a convolutional neural network (CNN) keyword spotter, giving a much faster system than direct DTW. Here we consider how available data from well-resourced languages can improve this CNN-DTW approach. We show that multilingual BNFs trained on ten languages improve the area under the ROC curve of a CNN-DTW system by 10.9% absolute relative to the MFCC baseline. By combining low-resource DTW-based supervision with information from well-resourced languages, CNN-DTW is a competitive option for low-resource keyword spotting.
Submitted 23 July, 2018;
originally announced July 2018.
-
Direct Learning of Sparse Changes in Markov Networks by Density Ratio Estimation
Authors:
Song Liu,
John A. Quinn,
Michael U. Gutmann,
Taiji Suzuki,
Masashi Sugiyama
Abstract:
We propose a new method for detecting changes in Markov network structure between two sets of samples. Instead of naively fitting two Markov network models separately to the two data sets and figuring out their difference, we directly learn the network structure change by estimating the ratio of Markov network models. This density-ratio formulation naturally allows us to introduce sparsity in the network structure change, which highly contributes to enhancing interpretability. Furthermore, computation of the normalization term, which is a critical bottleneck of the naive approach, can be remarkably mitigated. We also give the dual formulation of the optimization problem, which further reduces the computation cost for large-scale Markov networks. Through experiments, we demonstrate the usefulness of our method.
Submitted 1 January, 2014; v1 submitted 25 April, 2013;
originally announced April 2013.
-
Density Ratio Hidden Markov Models
Authors:
John A. Quinn,
Masashi Sugiyama
Abstract:
Hidden Markov models and their variants are the predominant sequential classification method in such domains as speech recognition, bioinformatics and natural language processing. Being generative rather than discriminative models, however, their classification performance is a drawback. In this paper we apply ideas from the field of density ratio estimation to bypass the difficult step of learning likelihood functions in HMMs. By reformulating inference and model fitting in terms of density ratios and applying a fast kernel-based estimation method, we show that it is possible to obtain a striking increase in discriminative performance while retaining the probabilistic qualities of the HMM. We demonstrate experimentally that this formulation makes more efficient use of training data than alternative approaches.
Submitted 15 February, 2013;
originally announced February 2013.
-
Directed Information Graphs
Authors:
Christopher J. Quinn,
Negar Kiyavash,
Todd P. Coleman
Abstract:
We propose a graphical model for representing networks of stochastic processes, the minimal generative model graph. It is based on reduced factorizations of the joint distribution over time. We show that under appropriate conditions, it is unique and consistent with another type of graphical model, the directed information graph, which is based on a generalization of Granger causality. We demonstrate how directed information quantifies Granger causality in a particular sequential prediction setting. We also develop efficient methods to estimate the topological structure from data that obviate estimating the joint statistics. One algorithm assumes upper-bounds on the degrees and uses the minimal dimension statistics necessary. In the event that the upper-bounds are not valid, the resulting graph is nonetheless an optimal approximation. Another algorithm uses near-minimal dimension statistics when no bounds are known but the distribution satisfies a certain criterion. Analogous to how structure learning algorithms for undirected graphical models use mutual information estimates, these algorithms use directed information estimates. We characterize the sample-complexity of two plug-in directed information estimators and obtain confidence intervals. For the setting when point estimates are unreliable, we propose an algorithm that uses confidence intervals to identify the best approximation that is robust to estimation error. Lastly, we demonstrate the effectiveness of the proposed algorithms through analysis of both synthetic data and real data from the Twitter network. In the latter case, we identify which news sources influence users in the network by merely analyzing tweet times.
Submitted 11 March, 2015; v1 submitted 9 April, 2012;
originally announced April 2012.