Showing 1–50 of 94 results for author: Khan, M

Searching in archive stat.
  1. arXiv:2506.14280  [pdf, ps, other]

    cs.LG cs.AI cs.CL stat.ML

    Improving LoRA with Variational Learning

    Authors: Bai Cong, Nico Daheim, Yuesong Shen, Rio Yokota, Mohammad Emtiyaz Khan, Thomas Möllenhoff

    Abstract: Bayesian methods have recently been used to improve LoRA finetuning and, although they improve calibration, their effect on other metrics (such as accuracy) is marginal and can sometimes even be detrimental. Moreover, Bayesian methods also increase computational overheads and require additional tricks for them to work well. Here, we fix these issues by using a recently proposed variational algorit…

    Submitted 17 June, 2025; originally announced June 2025.

    Comments: 16 pages, 4 figures

  2. arXiv:2506.14262  [pdf, ps, other]

    cs.LG cs.AI stat.ML

    Knowledge Adaptation as Posterior Correction

    Authors: Mohammad Emtiyaz Khan

    Abstract: Adaptation is the holy grail of intelligence, but even the best AI models (like GPT) lack the adaptivity of toddlers. So the question remains: how can machines adapt quickly? Despite a lot of progress on model adaptation to facilitate continual and federated learning, as well as model merging, editing, unlearning, etc., little is known about the mechanisms by which machines can naturally learn to…

    Submitted 17 June, 2025; originally announced June 2025.

  3. arXiv:2506.13150  [pdf, ps, other]

    cs.LG math.OC stat.ML

    Federated ADMM from Bayesian Duality

    Authors: Thomas Möllenhoff, Siddharth Swaroop, Finale Doshi-Velez, Mohammad Emtiyaz Khan

    Abstract: ADMM is a popular method for federated deep learning which originated in the 1970s and, even though many new variants of it have been proposed since then, its core algorithmic structure has remained unchanged. Here, we take a major departure from the old structure and present a fundamentally new way to derive and extend federated ADMM. We propose to use a structure called Bayesian Duality which ex…

    Submitted 16 June, 2025; originally announced June 2025.

    Comments: Code is at https://github.com/team-approx-bayes/bayes-admm

  4. arXiv:2506.12903  [pdf, ps, other]

    stat.ML cs.LG

    Variational Learning Finds Flatter Solutions at the Edge of Stability

    Authors: Avrajit Ghosh, Bai Cong, Rio Yokota, Saiprasad Ravishankar, Rongrong Wang, Molei Tao, Mohammad Emtiyaz Khan, Thomas Möllenhoff

    Abstract: Variational Learning (VL) has recently gained popularity for training deep neural networks and is competitive to standard learning methods. Part of its empirical success can be explained by theories such as PAC-Bayes bounds, minimum description length and marginal likelihood, but there are few tools to unravel the implicit regularization in play. Here, we analyze the implicit regularization of VL…

    Submitted 15 June, 2025; originally announced June 2025.

  5. arXiv:2501.17325  [pdf, other]

    cs.LG cs.AI stat.ML

    Connecting Federated ADMM to Bayes

    Authors: Siddharth Swaroop, Mohammad Emtiyaz Khan, Finale Doshi-Velez

    Abstract: We provide new connections between two distinct federated learning approaches based on (i) ADMM and (ii) Variational Bayes (VB), and propose new variants by combining their complementary strengths. Specifically, we show that the dual variables in ADMM naturally emerge through the 'site' parameters used in VB with isotropic Gaussian covariances. Using this, we derive two versions of ADMM from VB th…

    Submitted 28 February, 2025; v1 submitted 28 January, 2025; originally announced January 2025.

  6. arXiv:2501.16988  [pdf, other]

    stat.ML cs.LG

    Marginal and Conditional Importance Measures from Machine Learning Models and Their Relationship with Conditional Average Treatment Effect

    Authors: Mohammad Kaviul Anam Khan, Olli Saarela, Rafal Kustra

    Abstract: Interpreting black-box machine learning models is challenging due to their strong dependence on data and inherently non-parametric nature. This paper reintroduces the concept of importance through "Marginal Variable Importance Metric" (MVIM), a model-agnostic measure of predictor importance based on the true conditional expectation function. MVIM evaluates predictors' influence on continuous or di…

    Submitted 28 January, 2025; v1 submitted 28 January, 2025; originally announced January 2025.

  7. arXiv:2501.04667  [pdf, other]

    stat.ML cs.LG stat.CO

    Natural Variational Annealing for Multimodal Optimization

    Authors: Tâm Le Minh, Julyan Arbel, Thomas Möllenhoff, Mohammad Emtiyaz Khan, Florence Forbes

    Abstract: We introduce a new multimodal optimization approach called Natural Variational Annealing (NVA) that combines the strengths of three foundational concepts to simultaneously search for multiple global and local modes of black-box nonconvex objectives. First, it implements a simultaneous search by using variational posteriors, such as, mixtures of Gaussians. Second, it applies annealing to gradually…

    Submitted 11 February, 2025; v1 submitted 8 January, 2025; originally announced January 2025.

  8. arXiv:2412.08147  [pdf, other]

    cs.LG cs.AI stat.ML

    How to Weight Multitask Finetuning? Fast Previews via Bayesian Model-Merging

    Authors: Hugo Monzón Maldonado, Thomas Möllenhoff, Nico Daheim, Iryna Gurevych, Mohammad Emtiyaz Khan

    Abstract: When finetuning multiple tasks altogether, it is important to carefully weigh them to get a good performance, but searching for good weights can be difficult and costly. Here, we propose to aid the search with fast previews to quickly get a rough idea of different reweighting options. We use model merging to create previews by simply reusing and averaging parameters of models trained on each task…

    Submitted 11 December, 2024; originally announced December 2024.

  9. arXiv:2411.04421  [pdf, other]

    cs.LG cs.AI cs.CL stat.ML

    Variational Low-Rank Adaptation Using IVON

    Authors: Bai Cong, Nico Daheim, Yuesong Shen, Daniel Cremers, Rio Yokota, Mohammad Emtiyaz Khan, Thomas Möllenhoff

    Abstract: We show that variational learning can significantly improve the accuracy and calibration of Low-Rank Adaptation (LoRA) without a substantial increase in the cost. We replace AdamW by the Improved Variational Online Newton (IVON) algorithm to finetune large language models. For Llama-2 with 7 billion parameters, IVON improves the accuracy over AdamW by 2.8% and expected calibration error by 4.6%. T…

    Submitted 9 November, 2024; v1 submitted 6 November, 2024; originally announced November 2024.

    Comments: Published at 38th Workshop on Fine-Tuning in Machine Learning (NeurIPS 2024). Code available at https://github.com/team-approx-bayes/ivon-lora. In version 2 we fixed a typo in the equation of prior in section 2

  10. arXiv:2404.08168  [pdf, other]

    cs.LG stat.ML

    Conformal Prediction via Regression-as-Classification

    Authors: Etash Guha, Shlok Natarajan, Thomas Möllenhoff, Mohammad Emtiyaz Khan, Eugene Ndiaye

    Abstract: Conformal prediction (CP) for regression can be challenging, especially when the output distribution is heteroscedastic, multimodal, or skewed. Some of the issues can be addressed by estimating a distribution over the output, but in reality, such approaches can be sensitive to estimation error and yield unstable intervals. Here, we circumvent the challenges by converting regression to a classifica…

    Submitted 11 April, 2024; originally announced April 2024.

    Comments: International Conference of Learning Representations 2024

    Journal ref: International Conference of Learning Representations 2024

  11. arXiv:2402.17641  [pdf, other]

    cs.LG cs.AI cs.CL math.OC stat.ML

    Variational Learning is Effective for Large Deep Networks

    Authors: Yuesong Shen, Nico Daheim, Bai Cong, Peter Nickl, Gian Maria Marconi, Clement Bazan, Rio Yokota, Iryna Gurevych, Daniel Cremers, Mohammad Emtiyaz Khan, Thomas Möllenhoff

    Abstract: We give extensive empirical evidence against the common belief that variational learning is ineffective for large neural networks. We show that an optimizer called Improved Variational Online Newton (IVON) consistently matches or outperforms Adam for training large networks such as GPT-2 and ResNets from scratch. IVON's computational costs are nearly identical to Adam but its predictive uncertaint…

    Submitted 6 June, 2024; v1 submitted 27 February, 2024; originally announced February 2024.

    Comments: Published at International Conference on Machine Learning (ICML), 2024. The first two authors contributed equally. Code is available here: https://github.com/team-approx-bayes/ivon

  12. arXiv:2402.00809  [pdf, other]

    cs.LG stat.ML

    Position: Bayesian Deep Learning is Needed in the Age of Large-Scale AI

    Authors: Theodore Papamarkou, Maria Skoularidou, Konstantina Palla, Laurence Aitchison, Julyan Arbel, David Dunson, Maurizio Filippone, Vincent Fortuin, Philipp Hennig, José Miguel Hernández-Lobato, Aliaksandr Hubin, Alexander Immer, Theofanis Karaletsos, Mohammad Emtiyaz Khan, Agustinus Kristiadi, Yingzhen Li, Stephan Mandt, Christopher Nemeth, Michael A. Osborne, Tim G. J. Rudner, David Rügamer, Yee Whye Teh, Max Welling, Andrew Gordon Wilson, Ruqi Zhang

    Abstract: In the current landscape of deep learning research, there is a predominant emphasis on achieving high predictive accuracy in supervised tasks involving large image and language datasets. However, a broader perspective reveals a multitude of overlooked metrics, tasks, and data types, such as uncertainty, active and continual learning, and scientific data, that demand attention. Bayesian deep learni…

    Submitted 6 August, 2024; v1 submitted 1 February, 2024; originally announced February 2024.

    Comments: Proceedings of the 41st International Conference on Machine Learning, Vienna, Austria. PMLR 235, 2024

  13. arXiv:2401.06261  [pdf, other]

    stat.ME q-bio.QM

    Prediction of causal genes at GWAS loci with pleiotropic gene regulatory effects using sets of correlated instrumental variables

    Authors: Mariyam Khan, Adriaan-Alexander Ludl, Sean Bankier, Johan Bjorkegren, Tom Michoel

    Abstract: Multivariate Mendelian randomization (MVMR) is a statistical technique that uses sets of genetic instruments to estimate the direct causal effects of multiple exposures on an outcome of interest. At genomic loci with pleiotropic gene regulatory effects, that is, loci where the same genetic variants are associated to multiple nearby genes, MVMR can potentially be used to predict candidate causal ge…

    Submitted 20 September, 2024; v1 submitted 11 January, 2024; originally announced January 2024.

    Comments: Revised version, 31 pages, 5 figures. "TeX Source" contains file SI.pdf with Supplementary Information (26 pages, 9 figures). Code available at https://github.com/mariyam-khan/Causal_genes_GWAS_loci_CAD . Supporting data available at https://dataverse.no/dataset.xhtml?persistentId=doi:10.18710/VM0WKQ

  14. arXiv:2311.01669  [pdf]

    stat.AP

    Motor vehicles accidents and teenage drivers: A statistical analysis of their age and injuries

    Authors: Debo Brata Paul Argha, Md Javed Imtiaze Khan

    Abstract: Motorcycle accidents are a prevalent problem in Texas, resulting in hundreds of injuries and deaths each year. Motorcycles provide the driver with little physical protection during accidents compared to cars and other vehicles, so when there is a collision involving a motorcycle, the motorcyclist is likely to be injured. While there are numerous reasons for motorcycle accidents, most are caused by…

    Submitted 2 November, 2023; originally announced November 2023.

    Comments: 10 pages

  15. arXiv:2310.19273  [pdf, other]

    cs.LG cs.AI stat.ML

    The Memory Perturbation Equation: Understanding Model's Sensitivity to Data

    Authors: Peter Nickl, Lu Xu, Dharmesh Tailor, Thomas Möllenhoff, Mohammad Emtiyaz Khan

    Abstract: Understanding model's sensitivity to its training data is crucial but can also be challenging and costly, especially during training. To simplify such issues, we present the Memory-Perturbation Equation (MPE) which relates model's sensitivity to perturbation in its training data. Derived using Bayesian principles, the MPE unifies existing sensitivity measures, generalizes them to a wide-variety of…

    Submitted 16 January, 2024; v1 submitted 30 October, 2023; originally announced October 2023.

    Comments: 37th Conference on Neural Information Processing Systems (NeurIPS 2023)

  16. arXiv:2310.14348  [pdf, other]

    cs.MA stat.ML

    DePAint: A Decentralized Safe Multi-Agent Reinforcement Learning Algorithm considering Peak and Average Constraints

    Authors: Raheeb Hassan, K. M. Shadman Wadith, Md. Mamun or Rashid, Md. Mosaddek Khan

    Abstract: The domain of safe multi-agent reinforcement learning (MARL), despite its potential applications in areas ranging from drone delivery and vehicle automation to the development of zero-energy communities, remains relatively unexplored. The primary challenge involves training agents to learn optimal policies that maximize rewards while adhering to stringent safety constraints, all without the oversi…

    Submitted 3 April, 2024; v1 submitted 22 October, 2023; originally announced October 2023.

    Comments: accepted for publication in Springer Applied Intelligence Journal

  17. arXiv:2310.10553  [pdf, other]

    cs.LG cs.MA stat.ML

    TacticAI: an AI assistant for football tactics

    Authors: Zhe Wang, Petar Veličković, Daniel Hennes, Nenad Tomašev, Laurel Prince, Michael Kaisers, Yoram Bachrach, Romuald Elie, Li Kevin Wenliang, Federico Piccinini, William Spearman, Ian Graham, Jerome Connor, Yi Yang, Adrià Recasens, Mina Khan, Nathalie Beauguerlange, Pablo Sprechmann, Pol Moreno, Nicolas Heess, Michael Bowling, Demis Hassabis, Karl Tuyls

    Abstract: Identifying key patterns of tactics implemented by rival teams, and developing effective responses, lies at the heart of modern football. However, doing so algorithmically remains an open research challenge. To address this unmet need, we propose TacticAI, an AI football tactics assistant developed and evaluated in close collaboration with domain experts from Liverpool FC. We focus on analysing co…

    Submitted 17 October, 2023; v1 submitted 16 October, 2023; originally announced October 2023.

    Comments: 32 pages, 10 figures

  18. arXiv:2306.15169  [pdf, other]

    cs.LG stat.ML

    Exploiting Inferential Structure in Neural Processes

    Authors: Dharmesh Tailor, Mohammad Emtiyaz Khan, Eric Nalisnick

    Abstract: Neural Processes (NPs) are appealing due to their ability to perform fast adaptation based on a context set. This set is encoded by a latent variable, which is often assumed to follow a simple distribution. However, in real-world settings, the context set may be drawn from richer distributions having multiple modes, heavy tails, etc. In this work, we provide a framework that allows NPs' latent vari…

    Submitted 26 June, 2023; originally announced June 2023.

    Comments: Uncertainty in Artificial Intelligence (UAI) 2023

  19. arXiv:2306.03566  [pdf, other]

    cs.LG stat.ML

    Memory-Based Dual Gaussian Processes for Sequential Learning

    Authors: Paul E. Chang, Prakhar Verma, S. T. John, Arno Solin, Mohammad Emtiyaz Khan

    Abstract: Sequential learning with Gaussian processes (GPs) is challenging when access to past data is limited, for example, in continual and active learning. In such cases, errors can accumulate over time due to inaccuracies in the posterior, hyperparameters, and inducing points, making accurate learning challenging. Here, we present a method to keep all such errors in check using the recently proposed dua…

    Submitted 6 June, 2023; originally announced June 2023.

    Comments: International Conference on Machine Learning (ICML) 2023

  20. arXiv:2304.14251  [pdf, ps, other]

    cs.LG cs.AI stat.ML

    Variational Bayes Made Easy

    Authors: Mohammad Emtiyaz Khan

    Abstract: Variational Bayes is a popular method for approximate inference but its derivation can be cumbersome. To simplify the process, we give a 3-step recipe to identify the posterior form by explicitly looking for linearity with respect to expectations of well-known distributions. We can then directly write the update by simply "reading-off" the terms in front of those expectations. The recipe makes t…

    Submitted 10 July, 2023; v1 submitted 27 April, 2023; originally announced April 2023.

    Journal ref: Presented at the 5th Symposium on Advances in Approximate Bayesian Inference (AABI 2023)

  21. arXiv:2303.12210  [pdf, ps, other]

    stat.ML cs.LG

    A Random Projection k Nearest Neighbours Ensemble for Classification via Extended Neighbourhood Rule

    Authors: Amjad Ali, Muhammad Hamraz, Dost Muhammad Khan, Wajdan Deebani, Zardad Khan

    Abstract: Ensembles based on k nearest neighbours (kNN) combine a large number of base learners, each constructed on a sample taken from a given training data. Typical kNN based ensembles determine the k closest observations in the training data bounded to a test sample point by a spherical region to predict its class. In this paper, a novel random projection extended neighbourhood rule (RPExNRule) ensemble…

    Submitted 21 March, 2023; originally announced March 2023.

    Comments: 23 pages, 8 diagrams, 69 references

    ACM Class: F.2.2

  22. arXiv:2303.04397  [pdf, other]

    cs.LG stat.ML

    The Lie-Group Bayesian Learning Rule

    Authors: Eren Mehmet Kıral, Thomas Möllenhoff, Mohammad Emtiyaz Khan

    Abstract: The Bayesian Learning Rule provides a framework for generic algorithm design but can be difficult to use for three reasons. First, it requires a specific parameterization of exponential family. Second, it uses gradients which can be difficult to compute. Third, its update may not always stay on the manifold. We address these difficulties by proposing an extension based on Lie-groups where posterio…

    Submitted 8 March, 2023; originally announced March 2023.

    Comments: AISTATS 2023

  23. arXiv:2303.01954  [pdf, other]

    stat.ML cs.AI cs.LG

    Synthetic Data Generator for Adaptive Interventions in Global Health

    Authors: Aditya Rastogi, Juan Francisco Garamendi, Ana Fernández del Río, Anna Guitart, Moiz Hassan Khan, Dexian Tang, África Periáñez

    Abstract: Artificial Intelligence and digital health have the potential to transform global health. However, having access to representative data to test and validate algorithms in realistic production environments is essential. We introduce HealthSyn, an open-source synthetic data generator of user behavior for testing reinforcement learning algorithms in the context of mobile health interventions. The gen…

    Submitted 27 April, 2023; v1 submitted 3 March, 2023; originally announced March 2023.

  24. arXiv:2302.09738  [pdf, other]

    stat.ML cs.LG

    Simplifying Momentum-based Positive-definite Submanifold Optimization with Applications to Deep Learning

    Authors: Wu Lin, Valentin Duruisseaux, Melvin Leok, Frank Nielsen, Mohammad Emtiyaz Khan, Mark Schmidt

    Abstract: Riemannian submanifold optimization with momentum is computationally challenging because, to ensure that the iterates remain on the submanifold, we often need to solve difficult differential equations. Here, we simplify such difficulties for a class of sparse or structured symmetric positive-definite matrices with the affine-invariant metric. We do so by proposing a generalized version of the Riem…

    Submitted 16 March, 2024; v1 submitted 19 February, 2023; originally announced February 2023.

    Comments: A long version of the ICML 2023 paper. Updated the main text to emphasize challenges of using existing Riemannian methods to estimate sparse and structured SPD matrices

  25. arXiv:2212.09931  [pdf, other]

    stat.CO stat.ML

    A Generalized Variable Importance Metric and Estimator for Black Box Machine Learning Models

    Authors: Mohammad Kaviul Anam Khan, Olli Saarela, Rafal Kustra

    Abstract: In this paper we define a population parameter, "Generalized Variable Importance Metric (GVIM)", to measure importance of predictors for black box machine learning methods, where the importance is not represented by model-based parameter. GVIM is defined for each input variable, using the true conditional expectation function, and it measures the variable's importance in affecting a continuous o…

    Submitted 23 December, 2023; v1 submitted 19 December, 2022; originally announced December 2022.

  26. arXiv:2211.11278  [pdf, ps, other]

    stat.ML cs.LG

    Optimal Extended Neighbourhood Rule k Nearest Neighbours Ensemble

    Authors: Amjad Ali, Zardad Khan, Dost Muhammad Khan, Saeed Aldahmani

    Abstract: The traditional k nearest neighbor (kNN) approach uses a distance formula within a spherical region to determine the k closest training observations to a test sample point. However, this approach may not work well when test point is located outside this region. Moreover, aggregating many base kNN learners can result in poor ensemble performance due to high classification errors. To address these i…

    Submitted 15 February, 2024; v1 submitted 21 November, 2022; originally announced November 2022.

    Comments: This manuscript has been submitted for publication in the esteemed journal Pattern Recognition Letters

    MSC Class: 14J60

  27. arXiv:2210.01620  [pdf, other]

    cs.LG cs.AI math.OC stat.ML

    SAM as an Optimal Relaxation of Bayes

    Authors: Thomas Möllenhoff, Mohammad Emtiyaz Khan

    Abstract: Sharpness-aware minimization (SAM) and related adversarial deep-learning methods can drastically improve generalization, but their underlying mechanisms are not yet fully understood. Here, we establish SAM as a relaxation of the Bayes objective where the expected negative-loss is replaced by the optimal convex lower bound, obtained by using the so-called Fenchel biconjugate. The connection enables…

    Submitted 10 December, 2023; v1 submitted 4 October, 2022; originally announced October 2022.

    Comments: Accepted at ICLR 2023. Changes: Link to source code (https://github.com/team-approx-bayes/bayesian-sam), fix a typo in Appendix D

  28. arXiv:2208.04998  [pdf, ps, other]

    cs.NI cs.MM eess.IV eess.SY stat.AP

    Towards Enabling Next Generation Societal Virtual Reality Applications for Virtual Human Teleportation

    Authors: Jacob Chakareski, Mahmudur Khan, Murat Yuksel

    Abstract: Virtual reality (VR) is an emerging technology of great societal potential. Some of its most exciting and promising use cases include remote scene content and untethered lifelike navigation. This article first highlights the relevance of such future societal applications and the challenges ahead towards enabling them. It then provides a broad and contextual high-level perspective of several emergi…

    Submitted 9 August, 2022; originally announced August 2022.

    Comments: This is an extended version (with more details) of a tutorial feature article that will appear in the IEEE Signal Processing Magazine in September 2022

  29. arXiv:2206.05764  [pdf, other]

    cs.LG stat.ML

    Mining Multi-Label Samples from Single Positive Labels

    Authors: Youngin Cho, Daejin Kim, Mohammad Azam Khan, Jaegul Choo

    Abstract: Conditional generative adversarial networks (cGANs) have shown superior results in class-conditional generation tasks. To simultaneously control multiple conditions, cGANs require multi-label training datasets, where multiple labels can be assigned to each data instance. Nevertheless, the tremendous annotation cost limits the accessibility of multi-label datasets in real-world scenarios. Therefore…

    Submitted 28 May, 2023; v1 submitted 12 June, 2022; originally announced June 2022.

    Comments: NeurIPS 2022

  30. arXiv:2202.08146  [pdf, other]

    cs.LG cs.AI eess.SP stat.ML

    A Prospective Approach for Human-to-Human Interaction Recognition from Wi-Fi Channel Data using Attention Bidirectional Gated Recurrent Neural Network with GUI Application Implementation

    Authors: Md. Mohi Uddin Khan, Abdullah Bin Shams, Md. Mohsin Sarker Raihan

    Abstract: Human Activity Recognition (HAR) research has gained significant momentum due to recent technological advancements, artificial intelligence algorithms, the need for smart cities, and socioeconomic transformation. However, existing computer vision and sensor-based HAR solutions have limitations such as privacy issues, memory and power consumption, and discomfort in wearing sensors for which researc…

    Submitted 9 May, 2023; v1 submitted 16 February, 2022; originally announced February 2022.

    Comments: 48 Pages. This is the pre-print version article submitted for peer-review to a prestigious journal

  31. arXiv:2112.08211  [pdf, other]

    stat.ML cs.AI cs.LG q-bio.QM

    TrialGraph: Machine Intelligence Enabled Insight from Graph Modelling of Clinical Trials

    Authors: Christopher Yacoumatos, Stefano Bragaglia, Anshul Kanakia, Nils Svangård, Jonathan Mangion, Claire Donoghue, Jim Weatherall, Faisal M. Khan, Khader Shameer

    Abstract: A major impediment to successful drug development is the complexity, cost, and scale of clinical trials. The detailed internal structure of clinical trial data can make conventional optimization difficult to achieve. Recent advances in machine learning, specifically graph-structured data analysis, have the potential to enable significant progress in improving the clinical trial design. TrialGraph…

    Submitted 15 December, 2021; originally announced December 2021.

    Comments: 17 pages (Manuscript); 3 pages (Supplemental Data); 9 figures

    MSC Class: 68Q04; 05Cxx

    ACM Class: J.3.1; I.2.0; I.5.1; I.7; H.3

  32. arXiv:2111.04892  [pdf]

    stat.AP

    Determinants of Women's Attitude towards Intimate Partner Violence: Evidence from Bangladesh

    Authors: Md Tareq Ferdous Khan, Lianfen Qian

    Abstract: Purpose: The purpose of this study is to identify the important determinants responsible for the variation in women's attitude towards intimate partner violence (IPV). Methods: A nationally representative Bangladesh Demographic and Health Survey 2014 data of 17,863 women is used to address the research questions. In the study, two response variables are constructed from the five attitude questions…

    Submitted 8 November, 2021; originally announced November 2021.

    Comments: 22 pages, 6 tables, 2 figures

  33. arXiv:2111.03412  [pdf, other]

    cs.LG stat.ML

    Dual Parameterization of Sparse Variational Gaussian Processes

    Authors: Vincent Adam, Paul E. Chang, Mohammad Emtiyaz Khan, Arno Solin

    Abstract: Sparse variational Gaussian process (SVGP) methods are a common choice for non-conjugate Gaussian process inference because of their computational benefits. In this paper, we improve their computational efficiency by using a dual parameterization where each data example is assigned dual parameters, similarly to site parameters used in expectation propagation. Our dual parameterization speeds-up in…

    Submitted 19 January, 2022; v1 submitted 5 November, 2021; originally announced November 2021.

    Comments: Advances in Neural Information Processing Systems (NeurIPS 2021)

  34. arXiv:2109.05529  [pdf]

    econ.EM stat.AP stat.ML

    Estimating a new panel MSK dataset for comparative analyses of national absorptive capacity systems, economic growth, and development in low and middle income economies

    Authors: Muhammad Salar Khan

    Abstract: Within the national innovation system literature, empirical analyses are severely lacking for developing economies. Particularly, the low- and middle-income countries (LMICs) eligible for the World Bank's International Development Association (IDA) support, are rarely part of any empirical discourse on growth, development, and innovation. One major issue hindering panel analyses in LMICs, and thus…

    Submitted 12 September, 2021; originally announced September 2021.

    Comments: 65 pages including figures and tables

  35. arXiv:2108.05660  [pdf, other]

    cs.LG cs.AI q-bio.BM stat.ML

    Development of a Risk-Free COVID-19 Screening Algorithm from Routine Blood Tests Using Ensemble Machine Learning

    Authors: Md. Mohsin Sarker Raihan, Md. Mohi Uddin Khan, Laboni Akter, Abdullah Bin Shams

    Abstract: The Reverse Transcription Polymerase Chain Reaction (RTPCR) test is the silver bullet diagnostic test to discern COVID infection. Rapid antigen detection is a screening test to identify COVID positive patients in little as 15 minutes, but has a lower sensitivity than the PCR tests. Besides having multiple standardized test kits, many people are getting infected and either recovering or dying even…

    Submitted 9 May, 2023; v1 submitted 12 August, 2021; originally announced August 2021.

    Comments: Please read the (most updated) published version from here: https://doi.org/10.1201/9781003256083 and cite our article (Chapter-11). Video and BibTex citation format can be found in the description: https://youtu.be/Ci8dznDadJ4

    Journal ref: Applied Intelligence for Industry 4.0. Chapman and Hall/CRC. 2023

  36. arXiv:2108.01124  [pdf]

    cs.CR cs.AI cs.LG stat.AP

    Efficacy of Statistical and Artificial Intelligence-based False Information Cyberattack Detection Models for Connected Vehicles

    Authors: Sakib Mahmud Khan, Gurcan Comert, Mashrur Chowdhury

    Abstract: Connected vehicles (CVs), because of the external connectivity with other CVs and connected infrastructure, are vulnerable to cyberattacks that can instantly compromise the safety of the vehicle itself and other connected vehicles and roadway infrastructure. One such cyberattack is the false information attack, where an external attacker injects inaccurate information into the connected vehicles a…

    Submitted 2 August, 2021; originally announced August 2021.

    Comments: 18 pages, 6 figures

  37. arXiv:2107.10884  [pdf, other]

    stat.ML cs.LG

    Structured second-order methods via natural gradient descent

    Authors: Wu Lin, Frank Nielsen, Mohammad Emtiyaz Khan, Mark Schmidt

    Abstract: In this paper, we propose new structured second-order methods and structured adaptive-gradient methods obtained by performing natural-gradient descent on structured parameter spaces. Natural-gradient descent is an attractive approach to design new algorithms in many settings such as gradient-free, adaptive-gradient, and second-order methods. Our structured methods not only enjoy a structural invar…

    Submitted 19 February, 2022; v1 submitted 22 July, 2021; originally announced July 2021.

    Comments: Fixed some typos and added a new figure. ICML 2021 workshop paper. A short version of arXiv:2102.07405 with a focus on optimization tasks

  38. arXiv:2107.08265  [pdf, other]

    stat.ML cs.LG

    Subset-of-Data Variational Inference for Deep Gaussian-Processes Regression

    Authors: Ayush Jain, P. K. Srijith, Mohammad Emtiyaz Khan

    Abstract: Deep Gaussian Processes (DGPs) are multi-layer, flexible extensions of Gaussian processes but their training remains challenging. Sparse approximations simplify the training but often require optimization over a large number of inducing inputs and their locations across layers. In this paper, we simplify the training by setting the locations to a fixed subset of data and sampling the inducing inpu…

    Submitted 17 July, 2021; originally announced July 2021.

    Comments: Accepted in the 37th Conference on Uncertainty in Artificial Intelligence (UAI 2021)

  39. arXiv:2107.04562  [pdf, other]

    stat.ML cs.LG

    The Bayesian Learning Rule

    Authors: Mohammad Emtiyaz Khan, Håvard Rue

    Abstract: We show that many machine-learning algorithms are specific instances of a single algorithm called the \emph{Bayesian learning rule}. The rule, derived from Bayesian principles, yields a wide-range of algorithms from fields such as optimization, deep learning, and graphical models. This includes classical algorithms such as ridge regression, Newton's method, and Kalman filter, as well as modern dee…

    Submitted 8 June, 2024; v1 submitted 9 July, 2021; originally announced July 2021.

    Journal ref: Journal of Machine Learning Research 24, no. 281 (2023): 1-46

  40. arXiv:2106.08769  [pdf, other]

    cs.LG cs.AI stat.ML

    Knowledge-Adaptation Priors

    Authors: Mohammad Emtiyaz Khan, Siddharth Swaroop

    Abstract: Humans and animals have a natural ability to quickly adapt to their surroundings, but machine-learning models, when subjected to changes, often require a complete retraining from scratch. We present Knowledge-adaptation priors (K-priors) to reduce the cost of retraining by enabling quick and accurate adaptation for a wide-variety of tasks and models. This is made possible by a combination of weigh…

    Submitted 27 October, 2021; v1 submitted 16 June, 2021; originally announced June 2021.

  41. arXiv:2106.02613  [pdf, other]

    stat.ML cs.LG

    Bridging the Gap Between Target Networks and Functional Regularization

    Authors: Alexandre Piché, Valentin Thomas, Rafael Pardinas, Joseph Marino, Gian Maria Marconi, Christopher Pal, Mohammad Emtiyaz Khan

    Abstract: Bootstrapping is behind much of the successes of deep Reinforcement Learning. However, learning the value function via bootstrapping often leads to unstable training due to fast-changing target values. Target Networks are employed to stabilize training by using an additional set of lagging parameters to estimate the target values. Despite the popularity of Target Networks, their effect on the opti…

    Submitted 7 September, 2023; v1 submitted 4 June, 2021; originally announced June 2021.

    Comments: The first two authors contributed equally

  42. arXiv:2104.04975  [pdf, other]

    stat.ML cs.LG

    Scalable Marginal Likelihood Estimation for Model Selection in Deep Learning

    Authors: Alexander Immer, Matthias Bauer, Vincent Fortuin, Gunnar Rätsch, Mohammad Emtiyaz Khan

    Abstract: Marginal-likelihood based model-selection, even though promising, is rarely used in deep learning due to estimation difficulties. Instead, most approaches rely on validation data, which may not be readily available. In this work, we present a scalable marginal-likelihood estimation method to select both hyperparameters and network architectures, based on the training data alone. Some hyperparamete…

    Submitted 15 June, 2021; v1 submitted 11 April, 2021; originally announced April 2021.

    Comments: ICML 2021

  43. arXiv:2102.07405  [pdf, other]

    stat.ML cs.LG

    Tractable structured natural gradient descent using local parameterizations

    Authors: Wu Lin, Frank Nielsen, Mohammad Emtiyaz Khan, Mark Schmidt

    Abstract: Natural-gradient descent (NGD) on structured parameter spaces (e.g., low-rank covariances) is computationally challenging due to difficult Fisher-matrix computations. We address this issue by using \emph{local-parameter coordinates} to obtain a flexible and efficient NGD method that works well for a wide-variety of structured parameterizations. We show four applications where our method (1) genera…

    Submitted 17 January, 2022; v1 submitted 15 February, 2021; originally announced February 2021.

    Comments: An extended version of the ICML 2021 paper. Note: A workshop (short) paper with a focus on optimization tasks can be found at arXiv:2107.10884

  44. arXiv:2007.04731  [pdf, other]

    cs.LG stat.ML

    Fast Variational Learning in State-Space Gaussian Process Models

    Authors: Paul E. Chang, William J. Wilkinson, Mohammad Emtiyaz Khan, Arno Solin

    Abstract: Gaussian process (GP) regression with 1D inputs can often be performed in linear time via a stochastic differential equation formulation. However, for non-Gaussian likelihoods, this requires application of approximate inference methods which can make the implementation difficult, e.g., expectation propagation can be numerically unstable and variational inference can be computationally inefficient.…

    Submitted 17 July, 2020; v1 submitted 9 July, 2020; originally announced July 2020.

    Comments: To appear in MLSP 2020

  45. arXiv:2005.13463  [pdf, other]

    stat.AP

    Latent Racial Bias -- Evaluating Racism in Police Stop-and-Searches

    Authors: Akbir Khan

    Abstract: In this paper, we introduce the latent racial bias, a metric and method to evaluate the racial bias within specific events. For the purpose of this paper we explore the British Home Office dataset of stop-and-search incidents. We explore the racial bias in the choice of targets, using a number of statistical models such as graphical probabilistic and TrueSkill Ranking. Firstly, we propose a probab…

    Submitted 8 May, 2020; originally announced May 2020.

  46. arXiv:2005.12093  [pdf, ps, other]

    math.ST stat.AP

    Mixing properties of Skellam-GARCH processes

    Authors: Paul Doukhan, Naushad Mamode Khan, Michael H. Neumann

    Abstract: We consider integer-valued GARCH processes, where the count variable conditioned on past values of the count and state variables follows a so-called Skellam distribution. Using arguments for contractive Markov chains we prove that the process has a unique stationary regime. Furthermore, we show asymptotic regularity ($β$-mixing) with geometrically decaying coefficients for the count process. These…

    Submitted 13 August, 2020; v1 submitted 25 May, 2020; originally announced May 2020.

    MSC Class: 60G10; 60J05

  47. arXiv:2004.14070  [pdf, other]

    stat.ML cs.LG

    Continual Deep Learning by Functional Regularisation of Memorable Past

    Authors: Pingbo Pan, Siddharth Swaroop, Alexander Immer, Runa Eschenhagen, Richard E. Turner, Mohammad Emtiyaz Khan

    Abstract: Continually learning new skills is important for intelligent systems, yet standard deep learning methods suffer from catastrophic forgetting of the past. Recent works address this with weight regularisation. Functional regularisation, although computationally expensive, is expected to perform better, but rarely does so in practice. In this paper, we fix this issue by using a new functional-regular…

    Submitted 8 January, 2021; v1 submitted 29 April, 2020; originally announced April 2020.

  48. arXiv:2003.09018  [pdf, other]

    cs.CV cs.AI cs.LG stat.ML

    Human Activity Recognition from Wearable Sensor Data Using Self-Attention

    Authors: Saif Mahmud, M Tanjid Hasan Tonmoy, Kishor Kumar Bhaumik, A K M Mahbubur Rahman, M Ashraful Amin, Mohammad Shoyaib, Muhammad Asif Hossain Khan, Amin Ahsan Ali

    Abstract: Human Activity Recognition from body-worn sensor data poses an inherent challenge in capturing spatial and temporal dependencies of time-series signals. In this regard, the existing recurrent or convolutional or their hybrid models for activity recognition struggle to capture spatio-temporal context from the feature space of sensor reading sequence. To address this complex problem, we propose a se…

    Submitted 17 March, 2020; originally announced March 2020.

    Comments: Accepted for publication at the 24th European Conference on Artificial Intelligence (ECAI-2020); 8 pages, 4 figures

  49. arXiv:2002.12592  [pdf]

    eess.SP cs.LG stat.ML

    Wind Speed Prediction using Deep Ensemble Learning with a Jet-like Architecture

    Authors: Aqsa Saeed Qureshi, Asifullah Khan, Muhammad Waleed Khan

    Abstract: The wind is one of the most increasingly used renewable energy resources. Accurate and reliable forecast of wind speed is necessary for efficient power production; however, it is not an easy task because it depends upon meteorological features of the surrounding region. Deep learning is extensively used these days for performing feature extraction. It has also been observed that the integration of…

    Submitted 20 March, 2020; v1 submitted 28 February, 2020; originally announced February 2020.

    Comments: Pages: 14, Tables: 6, Figures: 3

  50. Compressing Large-Scale Transformer-Based Models: A Case Study on BERT

    Authors: Prakhar Ganesh, Yao Chen, Xin Lou, Mohammad Ali Khan, Yin Yang, Hassan Sajjad, Preslav Nakov, Deming Chen, Marianne Winslett

    Abstract: Pre-trained Transformer-based models have achieved state-of-the-art performance for various Natural Language Processing (NLP) tasks. However, these models often have billions of parameters, and, thus, are too resource-hungry and computation-intensive to suit low-capability devices or applications with strict latency requirements. One potential remedy for this is model compression, which has attrac…

    Submitted 1 June, 2021; v1 submitted 27 February, 2020; originally announced February 2020.

    Comments: To appear in TACL 2021. The arXiv version is a pre-MIT Press publication version