Showing 1–33 of 33 results for author: Baker, A

  1. arXiv:2503.20031  [pdf, other]

    astro-ph.IM cs.CE

    Lossy Compression of Scientific Data: Applications Constrains and Requirements

    Authors: Franck Cappello, Allison Baker, Ebru Bozda, Martin Burtscher, Kyle Chard, Sheng Di, Paul Christopher O Grady, Peng Jiang, Shaomeng Li, Erik Lindahl, Peter Lindstrom, Magnus Lundborg, Kai Zhao, Xin Liang, Masaru Nagaso, Kento Sato, Amarjit Singh, Seung Woo Son, Dingwen Tao, Jiannan Tian, Robert Underwood, Kazutomo Yoshii, Danylo Lykov, Yuri Alexeev, Kyle Gerard Felker

    Abstract: Increasing data volumes from scientific simulations and instruments (supercomputers, accelerators, telescopes) often exceed network, storage, and analysis capabilities. The scientific community's response to this challenge is scientific data reduction. Reduction can take many forms, such as triggering, sampling, filtering, quantization, and dimensionality reduction. This report focuses on a specif…

    Submitted 25 March, 2025; originally announced March 2025.

    Comments: 33 pages

  2. arXiv:2412.10105  [pdf, other]

    cs.CL

    MALAMUTE: A Multilingual, Highly-granular, Template-free, Education-based Probing Dataset

    Authors: Sagi Shaier, George Arthur Baker, Chiranthan Sridhar, Lawrence E Hunter, Katharina von der Wense

    Abstract: Language models (LMs) have excelled in various broad domains. However, to ensure their safe and effective integration into real-world educational settings, they must demonstrate proficiency in specific, granular areas of knowledge. Existing cloze-style benchmarks, commonly used to evaluate LMs' knowledge, have three major limitations. They: 1) do not cover the educational domain; 2) typically focu…

    Submitted 13 December, 2024; originally announced December 2024.

  3. arXiv:2412.10079  [pdf, other]

    cs.CL

    Lost in the Middle, and In-Between: Enhancing Language Models' Ability to Reason Over Long Contexts in Multi-Hop QA

    Authors: George Arthur Baker, Ankush Raut, Sagi Shaier, Lawrence E Hunter, Katharina von der Wense

    Abstract: Previous work finds that recent long-context language models fail to make equal use of information in the middle of their inputs, preferring pieces of information located at the tail ends which creates an undue bias in situations where we would like models to be equally capable of using different parts of the input. Thus far, the problem has mainly only been considered in settings with single piec…

    Submitted 13 December, 2024; originally announced December 2024.

  4. arXiv:2411.12844  [pdf, other]

    cs.HC cs.CL cs.RO

    SCOUT: A Situated and Multi-Modal Human-Robot Dialogue Corpus

    Authors: Stephanie M. Lukin, Claire Bonial, Matthew Marge, Taylor Hudson, Cory J. Hayes, Kimberly A. Pollard, Anthony Baker, Ashley N. Foots, Ron Artstein, Felix Gervits, Mitchell Abrams, Cassidy Henry, Lucia Donatelli, Anton Leuski, Susan G. Hill, David Traum, Clare R. Voss

    Abstract: We introduce the Situated Corpus Of Understanding Transactions (SCOUT), a multi-modal collection of human-robot dialogue in the task domain of collaborative exploration. The corpus was constructed from multiple Wizard-of-Oz experiments where human participants gave verbal instructions to a remotely-located robot to move and gather information about its surroundings. SCOUT contains 89,056 utterance…

    Submitted 19 November, 2024; originally announced November 2024.

    Comments: 14 pages, 7 figures

    ACM Class: I.2.7; I.2.9; I.2.10; H.5.2; J.7

    Journal ref: 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024) https://aclanthology.org/2024.lrec-main.1259/

  5. Human-Robot Dialogue Annotation for Multi-Modal Common Ground

    Authors: Claire Bonial, Stephanie M. Lukin, Mitchell Abrams, Anthony Baker, Lucia Donatelli, Ashley Foots, Cory J. Hayes, Cassidy Henry, Taylor Hudson, Matthew Marge, Kimberly A. Pollard, Ron Artstein, David Traum, Clare R. Voss

    Abstract: In this paper, we describe the development of symbolic representations annotated on human-robot dialogue data to make dimensions of meaning accessible to autonomous systems participating in collaborative, natural language dialogue, and to enable common ground with human partners. A particular challenge for establishing common ground arises in remote dialogue (occurring in disaster relief or search…

    Submitted 19 November, 2024; originally announced November 2024.

    Comments: 52 pages, 14 figures

    ACM Class: I.2.7; I.2.9; I.2.10; H.5.2; J.7

    Journal ref: Language Resources and Evaluation 2024

  6. arXiv:2407.11988  [pdf, other]

    cs.CL

    Generating Harder Cross-document Event Coreference Resolution Datasets using Metaphoric Paraphrasing

    Authors: Shafiuddin Rehan Ahmed, Zhiyong Eric Wang, George Arthur Baker, Kevin Stowe, James H. Martin

    Abstract: The most popular Cross-Document Event Coreference Resolution (CDEC) datasets fail to convey the true difficulty of the task, due to the lack of lexical diversity between coreferring event triggers (words or phrases that refer to an event). Furthermore, there is a dearth of event datasets for figurative language, limiting a crucial avenue of research in event comprehension. We address these two iss…

    Submitted 5 June, 2024; originally announced July 2024.

    Comments: Short Paper, ACL 2024

  7. arXiv:2407.01082  [pdf, other]

    cs.CL

    Turning Up the Heat: Min-p Sampling for Creative and Coherent LLM Outputs

    Authors: Minh Nguyen, Andrew Baker, Clement Neo, Allen Roush, Andreas Kirsch, Ravid Shwartz-Ziv

    Abstract: Large Language Models (LLMs) generate text by sampling the next token from a probability distribution over the vocabulary at each decoding step. Popular sampling methods like top-p (nucleus sampling) often struggle to balance quality and diversity, especially at higher temperatures which lead to incoherent or repetitive outputs. We propose min-p sampling, a dynamic truncation method that adjusts t…

    Submitted 20 March, 2025; v1 submitted 1 July, 2024; originally announced July 2024.

    Comments: Added acknowledgements and minor rewordings to make the intro/abstract more readable. No major change in length or content

  8. arXiv:2404.10970  [pdf, other]

    eess.IV cs.HC eess.SP

    Remote Breathing Monitoring Using LiDAR Technology

    Authors: Omar Rinchi, Ahmad Alsharoa, Denise A. Baker

    Abstract: Breathing monitoring is crucial in healthcare for early detection of health issues, but traditional methods face challenges like invasiveness, privacy concerns, and limited applicability in daily settings. This paper introduces light detection and ranging (LiDAR) sensors as a remote, privacy-respecting alternative for monitoring breathing metrics, including inhalation/exhalation patterns, respirat…

    Submitted 16 April, 2024; originally announced April 2024.

    Comments: 5 pages, 6 figures, accepted in IEEE EMBC 2024

  9. arXiv:2404.08656  [pdf, other]

    cs.CL cs.AI

    Linear Cross-document Event Coreference Resolution with X-AMR

    Authors: Shafiuddin Rehan Ahmed, George Arthur Baker, Evi Judge, Michael Regan, Kristin Wright-Bettner, Martha Palmer, James H. Martin

    Abstract: Event Coreference Resolution (ECR) as a pairwise mention classification task is expensive both for automated systems and manual annotations. The task's quadratic difficulty is exacerbated when using Large Language Models (LLMs), making prompt engineering for ECR prohibitively costly. In this work, we propose a graphical representation of events, X-AMR, anchored around individual mentions using a …

    Submitted 24 March, 2024; originally announced April 2024.

    Comments: LREC-COLING 2024 main conference

  10. arXiv:2403.19509  [pdf]

    cs.CL cs.SD eess.AS

    Phonetic Segmentation of the UCLA Phonetics Lab Archive

    Authors: Eleanor Chodroff, Blaž Pažon, Annie Baker, Steven Moran

    Abstract: Research in speech technologies and comparative linguistics depends on access to diverse and accessible speech data. The UCLA Phonetics Lab Archive is one of the earliest multilingual speech corpora, with long-form audio recordings and phonetic transcriptions for 314 languages (Ladefoged et al., 2009). Recently, 95 of these languages were time-aligned with word-level phonetic transcriptions (Li et…

    Submitted 28 March, 2024; originally announced March 2024.

    Comments: Accepted at LREC-COLING 2024

  11. arXiv:2402.01040  [pdf]

    cs.HC

    Everyday Uses of Music Listening and Music Technologies by Caregivers and People with Dementia: Survey and Focus Group Study

    Authors: Dianna Vidas, Romina Carrasco, Ryan M. Kelly, Jenny Waycott, Jeanette Tamplin, Kate McMahon, Libby M. Flynn, Phoebe A. Stretton-Smith, Tanara Vieira Sousa, Felicity A. Baker

    Abstract: Music is a valuable non-pharmacological tool that provides benefits for people with dementia, and there is interest in designing technologies to support music use in dementia care. To ensure music technologies are appropriately designed for supporting caregivers and people living with dementia, there remains a need to better understand how music is currently used in everyday care at home. We aimed…

    Submitted 1 February, 2024; originally announced February 2024.

  12. arXiv:2210.17388  [pdf]

    cs.CE

    Combining noisy well data and expert knowledge in a Bayesian calibration of a flow model under uncertainties: an application to solute transport in the Ticino basin

    Authors: Emily A. Baker, Sauro Manenti, Alessandro Reali, Giancarlo Sangalli, Lorenzo Tamellini, Sara Todeschini

    Abstract: Groundwater flow modeling is commonly used to calculate groundwater heads, estimate groundwater flow paths and travel times, and provide insights into solute transport processes within an aquifer. However, the values of input parameters that drive groundwater flow models are often highly uncertain due to subsurface heterogeneity and geologic complexity in combination with lack of measurements/unre…

    Submitted 14 March, 2023; v1 submitted 31 October, 2022; originally announced October 2022.

    Comments: First submission

  13. arXiv:2206.12136  [pdf, other]

    eess.IV cs.CV

    Feature Representation Learning for Robust Retinal Disease Detection from Optical Coherence Tomography Images

    Authors: Sharif Amit Kamran, Khondker Fariha Hossain, Alireza Tavakkoli, Stewart Lee Zuckerbrod, Salah A. Baker

    Abstract: Ophthalmic images may contain identical-looking pathologies that can cause failure in automated techniques to distinguish different retinal degenerative diseases. Additionally, reliance on large annotated datasets and lack of knowledge distillation can restrict ML-based clinical support systems' deployment in real-world environments. To improve the robustness and transferability of knowledge, an e…

    Submitted 31 July, 2022; v1 submitted 24 June, 2022; originally announced June 2022.

    Comments: Accepted to MICCAI2022 Ophthalmic Medical Image Analysis (OMIA) Workshop

  14. arXiv:2206.01990  [pdf]

    cs.CE

    Combining the Morris Method and Multiple Error Metrics to Assess Aquifer Characteristics and Recharge in the Lower Ticino Basin, in Italy

    Authors: Emily A. Baker, Alessandro Cappato, Sara Todeschini, Lorenzo Tamellini, Giancarlo Sangalli, Alessandro Reali, Sauro Manenti

    Abstract: Groundwater flow model accuracy is often limited by the uncertainty in model parameters that characterize aquifer properties and aquifer recharge. Aquifer properties such as hydraulic conductivity can have an uncertainty spanning orders of magnitude. Meanwhile, parameters used to configure model boundary conditions can introduce additional uncertainty. In this study, the Morris Method sensitivity…

    Submitted 8 September, 2022; v1 submitted 4 June, 2022; originally announced June 2022.

    Comments: second submission after minor revisions

  15. Fast and Robust Femur Segmentation from Computed Tomography Images for Patient-Specific Hip Fracture Risk Screening

    Authors: Pall Asgeir Bjornsson, Alexander Baker, Ingmar Fleps, Yves Pauchard, Halldor Palsson, Stephen J. Ferguson, Sigurdur Sigurdsson, Vilmundur Gudnason, Benedikt Helgason, Lotta Maria Ellingsen

    Abstract: Osteoporosis is a common bone disease that increases the risk of bone fracture. Hip-fracture risk screening methods based on finite element analysis depend on segmented computed tomography (CT) images; however, current femur segmentation methods require manual delineations of large data sets. Here we propose a deep neural network for fully automated, accurate, and fast segmentation of the proximal…

    Submitted 20 April, 2022; originally announced April 2022.

    Comments: This article has been accepted for publication in Computer Methods in Biomechanics and Biomedical Engineering: Imaging & Visualization, published by Taylor & Francis

  16. arXiv:2203.10107  [pdf, other]

    cs.LG

    SiMCa: Sinkhorn Matrix Factorization with Capacity Constraints

    Authors: Eric Daoud, Luca Ganassali, Antoine Baker, Marc Lelarge

    Abstract: For a very broad range of problems, recommendation algorithms have been increasingly used over the past decade. In most of these algorithms, the predictions are built upon user-item affinity scores which are obtained from high-dimensional embeddings of items and users. In more complex scenarios, with geometrical or capacity constraints, prediction based on embeddings may not be sufficient and some…

    Submitted 18 March, 2022; originally announced March 2022.

    Comments: All comments are welcome

  17. arXiv:2202.02616  [pdf, other]

    stat.CO cs.CV

    DSSIM: a structural similarity index for floating-point data

    Authors: Allison H. Baker, Alexander Pinard, Dorit M. Hammerling

    Abstract: Data visualization is a critical component in terms of interacting with floating-point output data from large model simulation codes. Indeed, postprocessing analysis workflows on simulation data often generate a large number of images from the raw data, many of which are then compared to each other or to specified reference images. In this image-comparison scenario, image quality assessment (IQA)…

    Submitted 19 March, 2023; v1 submitted 5 February, 2022; originally announced February 2022.

  18. VTGAN: Semi-supervised Retinal Image Synthesis and Disease Prediction using Vision Transformers

    Authors: Sharif Amit Kamran, Khondker Fariha Hossain, Alireza Tavakkoli, Stewart Lee Zuckerbrod, Salah A. Baker

    Abstract: In Fluorescein Angiography (FA), an exogenous dye is injected in the bloodstream to image the vascular structure of the retina. The injected dye can cause adverse reactions such as nausea, vomiting, anaphylactic shock, and even death. In contrast, color fundus imaging is a non-invasive technique used for photographing the retina but does not have sufficient fidelity for capturing its vascular stru…

    Submitted 13 August, 2021; v1 submitted 14 April, 2021; originally announced April 2021.

    Comments: Accepted to ICCV 2021 Workshop on Computer Vision for Automated Medical Diagnosis

  19. RV-GAN: Segmenting Retinal Vascular Structure in Fundus Photographs using a Novel Multi-scale Generative Adversarial Network

    Authors: Sharif Amit Kamran, Khondker Fariha Hossain, Alireza Tavakkoli, Stewart Lee Zuckerbrod, Kenton M. Sanders, Salah A. Baker

    Abstract: High fidelity segmentation of both macro and microvascular structure of the retina plays a pivotal role in determining degenerative retinal diseases, yet it is a difficult problem. Due to successive resolution loss in the encoding phase combined with the inability to recover this lost information in the decoding phase, autoencoding based segmentation approaches are limited in their ability to extr…

    Submitted 14 May, 2021; v1 submitted 2 January, 2021; originally announced January 2021.

    Comments: Accepted to MICCAI2021

  20. arXiv:2009.09422  [pdf, other]

    q-bio.PE cond-mat.stat-mech cs.AI cs.LG

    Epidemic mitigation by statistical inference from contact tracing data

    Authors: Antoine Baker, Indaco Biazzo, Alfredo Braunstein, Giovanni Catania, Luca Dall'Asta, Alessandro Ingrosso, Florent Krzakala, Fabio Mazza, Marc Mézard, Anna Paola Muntoni, Maria Refinetti, Stefano Sarao Mannelli, Lenka Zdeborová

    Abstract: Contact-tracing is an essential tool in order to mitigate the impact of pandemic such as the COVID-19. In order to achieve efficient and scalable contact-tracing in real time, digital devices can play an important role. While a lot of attention has been paid to analyzing the privacy and ethical risks of the associated mobile applications, so far much less research has been devoted to optimizing th…

    Submitted 20 September, 2020; originally announced September 2020.

    Comments: 21 pages, 7 figures

    ACM Class: G.3; G.4; I.2.11; J.3

    Journal ref: PNAS 2021 Vol. 118 No. 32 e2106548118

  21. Fundus2Angio: A Conditional GAN Architecture for Generating Fluorescein Angiography Images from Retinal Fundus Photography

    Authors: Sharif Amit Kamran, Khondker Fariha Hossain, Alireza Tavakkoli, Stewart Lee Zuckerbrod, Salah A. Baker, Kenton M. Sanders

    Abstract: Carrying out clinical diagnosis of retinal vascular degeneration using Fluorescein Angiography (FA) is a time consuming process and can pose significant adverse effects on the patient. Angiography requires insertion of a dye that may cause severe adverse effects and can even be fatal. Currently, there are no non-invasive systems capable of generating Fluorescein Angiography images. However, retina…

    Submitted 29 September, 2020; v1 submitted 11 May, 2020; originally announced May 2020.

    Comments: 14 pages, Accepted to 15th International Symposium on Visual Computing 2020

  22. arXiv:2004.01571  [pdf, other]

    stat.ML cond-mat.dis-nn cs.LG eess.SP math.ST stat.CO

    Tree-AMP: Compositional Inference with Tree Approximate Message Passing

    Authors: Antoine Baker, Benjamin Aubin, Florent Krzakala, Lenka Zdeborová

    Abstract: We introduce Tree-AMP, standing for Tree Approximate Message Passing, a python package for compositional inference in high-dimensional tree-structured models. The package provides a unifying framework to study several approximate message passing algorithms previously derived for a variety of machine learning tasks such as generalized linear models, inference in multi-layer networks, matrix factori…

    Submitted 11 December, 2021; v1 submitted 3 April, 2020; originally announced April 2020.

    Comments: Source code available at https://github.com/sphinxteam/tramp and documentation at https://sphinxteam.github.io/tramp.docs

    Journal ref: Journal of Machine Learning Research 24 (2023) 1-89

  23. arXiv:2003.12828  [pdf, other]

    cs.AI

    Learning medical triage from clinicians using Deep Q-Learning

    Authors: Albert Buchard, Baptiste Bouvier, Giulia Prando, Rory Beard, Michail Livieratos, Dan Busbridge, Daniel Thompson, Jonathan Richens, Yuanzhao Zhang, Adam Baker, Yura Perov, Kostis Gourgoulias, Saurabh Johri

    Abstract: Medical Triage is of paramount importance to healthcare systems, allowing for the correct orientation of patients and allocation of the necessary resources to treat them adequately. While reliable decision-tree methods exist to triage patients based on their presentation, those trees implicitly require human inference and are not immediately applicable in a fully automated setting. On the other ha…

    Submitted 24 June, 2020; v1 submitted 28 March, 2020; originally announced March 2020.

    Comments: 17 pages, 4 figures, 3 tables, preprint, in press

    MSC Class: 93E35

  24. arXiv:1912.02008  [pdf, other]

    math.ST cond-mat.dis-nn cs.LG eess.SP stat.ML

    Exact asymptotics for phase retrieval and compressed sensing with random generative priors

    Authors: Benjamin Aubin, Bruno Loureiro, Antoine Baker, Florent Krzakala, Lenka Zdeborová

    Abstract: We consider the problem of compressed sensing and of (real-valued) phase retrieval with random measurement matrix. We derive sharp asymptotics for the information-theoretically optimal performance and for the best known polynomial algorithm for an ensemble of generative priors consisting of fully connected deep neural networks with random weight matrices and arbitrary activations. We compare the p…

    Submitted 12 June, 2020; v1 submitted 4 December, 2019; originally announced December 2019.

    Comments: 13+3 pages, 7 figures, v2 revised and accepted at MSML

    Journal ref: Proceedings of The First Mathematical and Scientific Machine Learning Conference, PMLR 107:55-73, 2020

  25. arXiv:1910.08091  [pdf, other]

    cs.AI cs.LG cs.PL stat.CO stat.ML

    MultiVerse: Causal Reasoning using Importance Sampling in Probabilistic Programming

    Authors: Yura Perov, Logan Graham, Kostis Gourgoulias, Jonathan G. Richens, Ciarán M. Lee, Adam Baker, Saurabh Johri

    Abstract: We elaborate on using importance sampling for causal reasoning, in particular for counterfactual inference. We show how this can be implemented natively in probabilistic programming. By considering the structure of the counterfactual query, one can significantly optimise the inference process. We also consider design choices to enable further optimisations. We introduce MultiVerse, a probabilistic…

    Submitted 28 January, 2020; v1 submitted 17 October, 2019; originally announced October 2019.

    Comments: Logan and Yura have made equal contributions to the paper. Accepted to the 2nd Symposium on Advances in Approximate Bayesian Inference (Vancouver, Canada, 2019)

  26. arXiv:1910.07474  [pdf, other]

    cs.LG cs.AI stat.ML

    Universal Marginaliser for Deep Amortised Inference for Probabilistic Programs

    Authors: Robert Walecki, Kostis Gourgoulias, Adam Baker, Chris Hart, Chris Lucas, Max Zwiessele, Albert Buchard, Maria Lomeli, Yura Perov, Saurabh Johri

    Abstract: Probabilistic programming languages (PPLs) are powerful modelling tools which allow to formalise our knowledge about the world and reason about its inherent uncertainty. Inference methods used in PPL can be computationally costly due to significant time burden and/or storage requirements; or they can lack theoretical guarantees of convergence and accuracy when applied to large scale graphical mode…

    Submitted 16 October, 2019; originally announced October 2019.

  27. arXiv:1906.04735  [pdf, other]

    stat.ML cs.IT cs.LG eess.SP math.ST

    On the Universality of Noiseless Linear Estimation with Respect to the Measurement Matrix

    Authors: Alia Abbara, Antoine Baker, Florent Krzakala, Lenka Zdeborová

    Abstract: In a noiseless linear estimation problem, one aims to reconstruct a vector x* from the knowledge of its linear projections y=Phi x*. There have been many theoretical works concentrating on the case where the matrix Phi is a random i.i.d. one, but a number of heuristic evidence suggests that many of these results are universal and extend well beyond this restricted case. Here we revisit this proble…

    Submitted 11 June, 2019; originally announced June 2019.

    Comments: 13 pages, 4 figures

    Journal ref: Journal of Physics A: Mathematical and Theoretical (2019)

  28. arXiv:1904.01131  [pdf, other]

    quant-ph cs.ET physics.chem-ph physics.comp-ph

    Q# and NWChem: Tools for Scalable Quantum Chemistry on Quantum Computers

    Authors: Guang Hao Low, Nicholas P. Bauman, Christopher E. Granade, Bo Peng, Nathan Wiebe, Eric J. Bylaska, Dave Wecker, Sriram Krishnamoorthy, Martin Roetteler, Karol Kowalski, Matthias Troyer, Nathan A. Baker

    Abstract: Fault-tolerant quantum computation promises to solve outstanding problems in quantum chemistry within the next decade. Realizing this promise requires scalable tools that allow users to translate descriptions of electronic structure problems to optimized quantum gate sequences executed on physical hardware, without requiring specialized quantum computing knowledge. To this end, we present a quantu…

    Submitted 1 April, 2019; originally announced April 2019.

    Comments: 36 pages, 5 figures. Examples and data in ancillary files folder

  29. arXiv:1810.13432  [pdf, other]

    cs.DC cs.PL

    Making root cause analysis feasible for large code bases: a solution approach for a climate model

    Authors: Daniel J. Milroy, Allison H. Baker, Dorit M. Hammerling, Youngsung Kim, Elizabeth R. Jessup, Thomas Hauser

    Abstract: For large-scale simulation codes with huge and complex code bases, where bit-for-bit comparisons are too restrictive, finding the source of statistically significant discrepancies (e.g., from a previous version, alternative hardware or supporting software stack) in output is non-trivial at best. Although there are many tools for program comprehension through debugging or slicing, few (if any) scal…

    Submitted 11 February, 2019; v1 submitted 31 October, 2018; originally announced October 2018.

  30. arXiv:1806.10698  [pdf, other]

    cs.AI cs.LG stat.AP stat.ML

    A comparative study of artificial intelligence and human doctors for the purpose of triage and diagnosis

    Authors: Salman Razzaki, Adam Baker, Yura Perov, Katherine Middleton, Janie Baxter, Daniel Mullarkey, Davinder Sangar, Michael Taliercio, Mobasher Butt, Azeem Majeed, Arnold DoRosario, Megan Mahoney, Saurabh Johri

    Abstract: Online symptom checkers have significant potential to improve patient care, however their reliability and accuracy remain variable. We hypothesised that an artificial intelligence (AI) powered triage and diagnostic system would compare favourably with human doctors with respect to triage and diagnostic accuracy. We performed a prospective validation study of the accuracy and safety of an AI powere…

    Submitted 27 June, 2018; originally announced June 2018.

  31. arXiv:1711.00695  [pdf, other]

    cs.LG stat.ML

    A Universal Marginalizer for Amortized Inference in Generative Models

    Authors: Laura Douglas, Iliyan Zarov, Konstantinos Gourgoulias, Chris Lucas, Chris Hart, Adam Baker, Maneesh Sahani, Yura Perov, Saurabh Johri

    Abstract: We consider the problem of inference in a causal generative model where the set of available observations differs between data instances. We show how combining samples drawn from the graphical model with an appropriate masking function makes it possible to train a single neural network to approximate all the corresponding conditional marginal distributions and thus amortize the cost of inference.…

    Submitted 2 November, 2017; originally announced November 2017.

    Comments: Submitted to the NIPS 2017 Workshop on Advances in Approximate Bayesian Inference

  32. arXiv:1308.5249  [pdf, ps, other]

    cs.IT

    A Note on Sparsification by Frames

    Authors: Christopher A. Baker

    Abstract: The purpose of this note is to establish a new generalized Dictionary-Restricted Isometry Property (D-RIP) sparsity bound constant for compressed sensing. For fulfilling D-RIP, the constant $δ_k$ is used in the definition: $(1 - δ_k)\|D v\|_2^2 \le \|ΦD v\|_2^2 \le (1 + δ_k)\|D v\|_2^2$. We prove that signals with $k$-sparse $D$-representation can be reconstructed if $δ_{2k} < \frac{2}{3}$. The app…

    Submitted 19 December, 2014; v1 submitted 23 August, 2013; originally announced August 2013.

  33. arXiv:0912.0284  [pdf, ps, other]

    math.KT cs.CG math.GT math.NA stat.ML

    Hodge Theory on Metric Spaces

    Authors: Laurent Bartholdi, Thomas Schick, Nat Smale, Steve Smale, Anthony W. Baker

    Abstract: Hodge theory is a beautiful synthesis of geometry, topology, and analysis, which has been developed in the setting of Riemannian manifolds. On the other hand, spaces of images, which are important in the mathematical foundations of vision and pattern recognition, do not fit this framework. This motivates us to develop a version of Hodge theory on metric spaces with a probability measure. We believ…

    Submitted 24 November, 2011; v1 submitted 1 December, 2009; originally announced December 2009.

    Comments: appendix by Anthony W. Baker, 48 pages, AMS-LaTeX. v2: final version, to appear in Foundations of Computational Mathematics. Minor changes and additions

    MSC Class: 58A14; 54E05; 55P55; 57M50

    Journal ref: Foundations of Computational Mathematics 12:1 (2012) 1-48