-
Lossy Compression of Scientific Data: Applications Constrains and Requirements
Authors:
Franck Cappello,
Allison Baker,
Ebru Bozdağ,
Martin Burtscher,
Kyle Chard,
Sheng Di,
Paul Christopher O'Grady,
Peng Jiang,
Shaomeng Li,
Erik Lindahl,
Peter Lindstrom,
Magnus Lundborg,
Kai Zhao,
Xin Liang,
Masaru Nagaso,
Kento Sato,
Amarjit Singh,
Seung Woo Son,
Dingwen Tao,
Jiannan Tian,
Robert Underwood,
Kazutomo Yoshii,
Danylo Lykov,
Yuri Alexeev,
Kyle Gerard Felker
Abstract:
Increasing data volumes from scientific simulations and instruments (supercomputers, accelerators, telescopes) often exceed network, storage, and analysis capabilities. The scientific community's response to this challenge is scientific data reduction. Reduction can take many forms, such as triggering, sampling, filtering, quantization, and dimensionality reduction. This report focuses on a specific technique: lossy compression. Lossy compression retains all data points, leveraging correlations and controlled reduced accuracy. Quality constraints, especially for quantities of interest, are crucial for preserving scientific discoveries. User requirements also include compression ratio and speed. While many papers have been published on lossy compression techniques and reference datasets are shared by the community, there is a lack of detailed specifications of application needs that can guide lossy compression researchers and developers. This report fills this gap by reporting on the requirements and constraints of nine scientific applications covering a large spectrum of domains (climate, combustion, cosmology, fusion, light sources, molecular dynamics, quantum circuit simulation, seismology, and system logs). The report also details key lossy compression technologies (SZ, ZFP, MGARD, LC, SPERR, DCTZ, TEZip, LibPressio), discussing their history, principles, error control, hardware support, features, and impact. By presenting both application needs and compression technologies, the report aims to inspire new research to fill existing gaps.
Submitted 25 March, 2025;
originally announced March 2025.
-
MALAMUTE: A Multilingual, Highly-granular, Template-free, Education-based Probing Dataset
Authors:
Sagi Shaier,
George Arthur Baker,
Chiranthan Sridhar,
Lawrence E Hunter,
Katharina von der Wense
Abstract:
Language models (LMs) have excelled in various broad domains. However, to ensure their safe and effective integration into real-world educational settings, they must demonstrate proficiency in specific, granular areas of knowledge. Existing cloze-style benchmarks, commonly used to evaluate LMs' knowledge, have three major limitations. They: 1) do not cover the educational domain; 2) typically focus on low-complexity, generic knowledge or broad domains, which do not adequately assess the models' knowledge in specific subjects; and 3) often rely on templates that can bias model predictions. Here, we introduce MALAMUTE, a multilingual, template-free, and highly granular probing dataset comprising expert-written, peer-reviewed probes from 71 university-level textbooks across three languages (English, Spanish, and Polish). MALAMUTE is the first education-based cloze-style dataset. It covers eight domains, each with up to 14 subdomains, further broken down into concepts and concept-based prompts, totaling 33,361 university curriculum concepts and 116,887 prompts. MALAMUTE's fine granularity, educational focus, and inclusion of both sentence-level and paragraph-level prompts make it an ideal tool for evaluating LMs' course-related knowledge. Our evaluation of masked and causal LMs on MALAMUTE shows that despite overall proficiency, they have significant gaps in knowledge when examined closely on specific subjects, hindering their safe use in classrooms and underscoring the need for further development.
Submitted 13 December, 2024;
originally announced December 2024.
-
Lost in the Middle, and In-Between: Enhancing Language Models' Ability to Reason Over Long Contexts in Multi-Hop QA
Authors:
George Arthur Baker,
Ankush Raut,
Sagi Shaier,
Lawrence E Hunter,
Katharina von der Wense
Abstract:
Previous work finds that recent long-context language models fail to make equal use of information in the middle of their inputs, preferring pieces of information located at the tail ends, which creates an undue bias in situations where we would like models to be equally capable of using different parts of the input. Thus far, the problem has mainly been considered in settings with single pieces of critical information, leading us to question what happens when multiple necessary pieces of information are spread out over the inputs. Here, we demonstrate the effects of the "lost in the middle" problem in the multi-hop question answering setting -- in which multiple reasoning "hops" over disconnected documents are required -- and show that performance degrades not only with respect to the distance of information from the edges of the context, but also between pieces of information. Additionally, we experiment with means of alleviating the problem by reducing superfluous document contents through knowledge graph triple extraction and summarization, and prompting models to reason more thoroughly using chain-of-thought prompting.
Submitted 13 December, 2024;
originally announced December 2024.
-
SCOUT: A Situated and Multi-Modal Human-Robot Dialogue Corpus
Authors:
Stephanie M. Lukin,
Claire Bonial,
Matthew Marge,
Taylor Hudson,
Cory J. Hayes,
Kimberly A. Pollard,
Anthony Baker,
Ashley N. Foots,
Ron Artstein,
Felix Gervits,
Mitchell Abrams,
Cassidy Henry,
Lucia Donatelli,
Anton Leuski,
Susan G. Hill,
David Traum,
Clare R. Voss
Abstract:
We introduce the Situated Corpus Of Understanding Transactions (SCOUT), a multi-modal collection of human-robot dialogue in the task domain of collaborative exploration. The corpus was constructed from multiple Wizard-of-Oz experiments where human participants gave verbal instructions to a remotely-located robot to move and gather information about its surroundings. SCOUT contains 89,056 utterances and 310,095 words from 278 dialogues averaging 320 utterances per dialogue. The dialogues are aligned with the multi-modal data streams available during the experiments: 5,785 images and 30 maps. The corpus has been annotated with Abstract Meaning Representation and Dialogue-AMR to identify the speaker's intent and meaning within an utterance, and with Transactional Units and Relations to track relationships between utterances to reveal patterns of the Dialogue Structure. We describe how the corpus and its annotations have been used to develop autonomous human-robot systems and enable research in open questions of how humans speak to robots. We release this corpus to accelerate progress in autonomous, situated, human-robot dialogue, especially in the context of navigation tasks where details about the environment need to be discovered.
Submitted 19 November, 2024;
originally announced November 2024.
-
Human-Robot Dialogue Annotation for Multi-Modal Common Ground
Authors:
Claire Bonial,
Stephanie M. Lukin,
Mitchell Abrams,
Anthony Baker,
Lucia Donatelli,
Ashley Foots,
Cory J. Hayes,
Cassidy Henry,
Taylor Hudson,
Matthew Marge,
Kimberly A. Pollard,
Ron Artstein,
David Traum,
Clare R. Voss
Abstract:
In this paper, we describe the development of symbolic representations annotated on human-robot dialogue data to make dimensions of meaning accessible to autonomous systems participating in collaborative, natural language dialogue, and to enable common ground with human partners. A particular challenge for establishing common ground arises in remote dialogue (occurring in disaster relief or search-and-rescue tasks), where a human and robot are engaged in a joint navigation and exploration task of an unfamiliar environment, but where the robot cannot immediately share high quality visual information due to limited communication constraints. Engaging in a dialogue provides an effective way to communicate, while on-demand or lower-quality visual information can be supplemented for establishing common ground. Within this paradigm, we capture propositional semantics and the illocutionary force of a single utterance within the dialogue through our Dialogue-AMR annotation, an augmentation of Abstract Meaning Representation. We then capture patterns in how different utterances within and across speaker floors relate to one another in our development of a multi-floor Dialogue Structure annotation schema. Finally, we begin to annotate and analyze the ways in which the visual modalities provide contextual information to the dialogue for overcoming disparities in the collaborators' understanding of the environment. We conclude by discussing the use-cases, architectures, and systems we have implemented from our annotations that enable physical robots to autonomously engage with humans in bi-directional dialogue and navigation.
Submitted 19 November, 2024;
originally announced November 2024.
-
Generating Harder Cross-document Event Coreference Resolution Datasets using Metaphoric Paraphrasing
Authors:
Shafiuddin Rehan Ahmed,
Zhiyong Eric Wang,
George Arthur Baker,
Kevin Stowe,
James H. Martin
Abstract:
The most popular Cross-Document Event Coreference Resolution (CDEC) datasets fail to convey the true difficulty of the task, due to the lack of lexical diversity between coreferring event triggers (words or phrases that refer to an event). Furthermore, there is a dearth of event datasets for figurative language, limiting a crucial avenue of research in event comprehension. We address these two issues by introducing ECB+META, a lexically rich variant of Event Coref Bank Plus (ECB+) for CDEC on symbolic and metaphoric language. We use ChatGPT as a tool for the metaphoric transformation of sentences in the documents of ECB+, then tag the original event triggers in the transformed sentences in a semi-automated manner. In this way, we avoid the re-annotation of expensive coreference links. We present results that show existing methods that work well on ECB+ struggle with ECB+META, thereby paving the way for CDEC research on a much more challenging dataset. Code/data: https://github.com/ahmeshaf/llms_coref
Submitted 5 June, 2024;
originally announced July 2024.
-
Turning Up the Heat: Min-p Sampling for Creative and Coherent LLM Outputs
Authors:
Minh Nguyen,
Andrew Baker,
Clement Neo,
Allen Roush,
Andreas Kirsch,
Ravid Shwartz-Ziv
Abstract:
Large Language Models (LLMs) generate text by sampling the next token from a probability distribution over the vocabulary at each decoding step. Popular sampling methods like top-p (nucleus sampling) often struggle to balance quality and diversity, especially at higher temperatures which lead to incoherent or repetitive outputs. We propose min-p sampling, a dynamic truncation method that adjusts the sampling threshold based on the model's confidence by using the top token's probability as a scaling factor. Our experiments on benchmarks including GPQA, GSM8K, and AlpacaEval Creative Writing show that min-p sampling improves both the quality and diversity of generated text across different model families (Mistral and Llama 3) and model sizes (1B to 123B parameters), especially at higher temperatures. Human evaluations further show a clear preference for min-p sampling, in both text quality and creativity. Min-p sampling has been adopted by popular open-source LLM frameworks, including Hugging Face Transformers, VLLM, and many others, highlighting its significant impact on improving text generation quality.
Submitted 20 March, 2025; v1 submitted 1 July, 2024;
originally announced July 2024.
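The truncation rule described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names `min_p_filter` and `sample_min_p`, the parameter name `p_base`, and its default of 0.1 are assumptions made here. The sketch takes an already-normalized probability distribution as input and leaves temperature scaling out.

```python
import random


def min_p_filter(probs, p_base=0.1):
    """Zero out tokens below p_base * max(probs), then renormalize.

    `p_base` (hypothetical name and default) is the base threshold in (0, 1];
    scaling it by the top token's probability makes the cutoff stricter when
    the model is confident and looser when the distribution is flat.
    """
    if not probs:
        raise ValueError("probs must be non-empty")
    threshold = p_base * max(probs)
    # The top token always survives because max(probs) >= p_base * max(probs).
    kept = [p if p >= threshold else 0.0 for p in probs]
    total = sum(kept)
    return [p / total for p in kept]


def sample_min_p(probs, p_base=0.1, rng=None):
    """Draw one token index from the min-p-truncated distribution."""
    rng = rng or random.Random()
    filtered = min_p_filter(probs, p_base)
    return rng.choices(range(len(filtered)), weights=filtered, k=1)[0]
```

With a fairly flat distribution such as `[0.5, 0.3, 0.15, 0.04, 0.01]` and `p_base=0.1`, the cutoff is 0.05, so three tokens remain candidates. With a peaked distribution such as `[0.9, 0.06, 0.04]`, the cutoff rises to 0.09 and only the top token survives.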
-
Remote Breathing Monitoring Using LiDAR Technology
Authors:
Omar Rinchi,
Ahmad Alsharoa,
Denise A. Baker
Abstract:
Breathing monitoring is crucial in healthcare for early detection of health issues, but traditional methods face challenges like invasiveness, privacy concerns, and limited applicability in daily settings. This paper introduces light detection and ranging (LiDAR) sensors as a remote, privacy-respecting alternative for monitoring breathing metrics, including inhalation/exhalation patterns, respiratory rates, breath depth, and detecting breathlessness. We highlight LiDAR's ability to function across various postures, presenting empirical evidence of its accuracy and reliability. Our findings position LiDAR as an innovative solution in breathing monitoring, offering significant advantages over conventional methods.
Submitted 16 April, 2024;
originally announced April 2024.
-
Linear Cross-document Event Coreference Resolution with X-AMR
Authors:
Shafiuddin Rehan Ahmed,
George Arthur Baker,
Evi Judge,
Michael Regan,
Kristin Wright-Bettner,
Martha Palmer,
James H. Martin
Abstract:
Event Coreference Resolution (ECR) as a pairwise mention classification task is expensive both for automated systems and manual annotations. The task's quadratic difficulty is exacerbated when using Large Language Models (LLMs), making prompt engineering for ECR prohibitively costly. In this work, we propose a graphical representation of events, X-AMR, anchored around individual mentions using a cross-document version of Abstract Meaning Representation. We then linearize the ECR with a novel multi-hop coreference algorithm over the event graphs. The event graphs simplify ECR, making it a) LLM cost-effective, b) compositional and interpretable, and c) easily annotated. For a fair assessment, we first enrich an existing ECR benchmark dataset with these event graphs using an annotator-friendly tool we introduce. Then, we employ GPT-4, the newest LLM by OpenAI, for these annotations. Finally, using the ECR algorithm, we assess GPT-4 against humans and analyze its limitations. Through this research, we aim to advance the state-of-the-art for efficient ECR and shed light on the potential shortcomings of current LLMs at this task. Code and annotations: https://github.com/ahmeshaf/gpt_coref
Submitted 24 March, 2024;
originally announced April 2024.
-
Phonetic Segmentation of the UCLA Phonetics Lab Archive
Authors:
Eleanor Chodroff,
Blaž Pažon,
Annie Baker,
Steven Moran
Abstract:
Research in speech technologies and comparative linguistics depends on access to diverse and accessible speech data. The UCLA Phonetics Lab Archive is one of the earliest multilingual speech corpora, with long-form audio recordings and phonetic transcriptions for 314 languages (Ladefoged et al., 2009). Recently, 95 of these languages were time-aligned with word-level phonetic transcriptions (Li et al., 2021). Here we present VoxAngeles, a corpus of audited phonetic transcriptions and phone-level alignments of the UCLA Phonetics Lab Archive, which uses the 95-language CMU re-release as our starting point. VoxAngeles also includes word- and phone-level segmentations from the original UCLA corpus, as well as phonetic measurements of word and phone durations, vowel formants, and vowel f0. This corpus enhances the usability of the original data, particularly for quantitative phonetic typology, as demonstrated through a case study of vowel intrinsic f0. We also discuss the utility of the VoxAngeles corpus for general research and pedagogy in crosslinguistic phonetics, as well as for low-resource and multilingual speech technologies. VoxAngeles is free to download and use under a CC-BY-NC 4.0 license.
Submitted 28 March, 2024;
originally announced March 2024.
-
Everyday Uses of Music Listening and Music Technologies by Caregivers and People with Dementia: Survey and Focus Group Study
Authors:
Dianna Vidas,
Romina Carrasco,
Ryan M. Kelly,
Jenny Waycott,
Jeanette Tamplin,
Kate McMahon,
Libby M. Flynn,
Phoebe A. Stretton-Smith,
Tanara Vieira Sousa,
Felicity A. Baker
Abstract:
Music is a valuable non-pharmacological tool that provides benefits for people with dementia, and there is interest in designing technologies to support music use in dementia care. To ensure music technologies are appropriately designed for supporting caregivers and people living with dementia, there remains a need to better understand how music is currently used in everyday care at home. We aimed to understand how people with dementia and their caregivers use music technologies in everyday caring, as well as challenges they experience using music and technology. This study used a mixed methods design. A survey was completed by 77 caregivers and people with dementia to understand their use of music and technology. Of these, 18 survey respondents (12 family caregivers, 6 people living with dementia) participated in focus groups about their experiences of using music and technology in care. Transcripts were analysed with reflexive thematic analysis. Most survey respondents used music often in their daily lives, reporting a range of music technologies such as CDs, radio, and streaming. Focus groups highlighted benefits and challenges of music technologies in everyday care. Participants used music and music technologies to regulate mood, provide joy, facilitate social connection, encourage reminiscence, provide continuity before and after diagnosis, and to make caregiving easier. Challenges of using music technology in care included difficulties staying up to date with evolving technology, and low self-efficacy for technology use expressed by people living with dementia. Evidently, people living with dementia and their caregivers use music technologies to support their everyday care needs. Results suggest opportunities to design technologies enabling easier access to music and supporting people living with dementia with recreational and therapeutic music listening and music-based activities.
Submitted 1 February, 2024;
originally announced February 2024.
-
Combining noisy well data and expert knowledge in a Bayesian calibration of a flow model under uncertainties: an application to solute transport in the Ticino basin
Authors:
Emily A. Baker,
Sauro Manenti,
Alessandro Reali,
Giancarlo Sangalli,
Lorenzo Tamellini,
Sara Todeschini
Abstract:
Groundwater flow modeling is commonly used to calculate groundwater heads, estimate groundwater flow paths and travel times, and provide insights into solute transport processes within an aquifer. However, the values of input parameters that drive groundwater flow models are often highly uncertain due to subsurface heterogeneity and geologic complexity in combination with missing or unreliable measurements. This uncertainty affects the accuracy and reliability of model outputs. Therefore, the parameters' uncertainty must be quantified before adopting the model as an engineering tool. In this study, we model the uncertain parameters as random variables and use a Bayesian inversion approach to obtain a posterior, data-informed probability density function (pdf) for them: in particular, the likelihood function we consider takes into account both well measurements and our prior knowledge about the extent of the springs in the domain under study. To keep the modeling and computational complexities under control, we assume Gaussianity of the posterior pdf of the parameters. To corroborate this assumption, we run an identifiability analysis of the model: we apply the inversion procedure to several sets of synthetic data polluted by increasing levels of noise, and we determine at which levels of noise we can effectively recover the "true value" of the parameters. We then move to real well data (coming from the Ticino River basin, in northern Italy, and spanning a month in summer 2014), and use the posterior pdf of the parameters as a starting point to perform an Uncertainty Quantification analysis on groundwater travel-time distributions.
Submitted 14 March, 2023; v1 submitted 31 October, 2022;
originally announced October 2022.
-
Feature Representation Learning for Robust Retinal Disease Detection from Optical Coherence Tomography Images
Authors:
Sharif Amit Kamran,
Khondker Fariha Hossain,
Alireza Tavakkoli,
Stewart Lee Zuckerbrod,
Salah A. Baker
Abstract:
Ophthalmic images may contain identical-looking pathologies that can cause failure in automated techniques to distinguish different retinal degenerative diseases. Additionally, reliance on large annotated datasets and lack of knowledge distillation can restrict ML-based clinical support systems' deployment in real-world environments. To improve the robustness and transferability of knowledge, an enhanced feature-learning module is required to extract meaningful spatial representations from the retinal subspace. Such a module, if used effectively, can detect unique disease traits and differentiate the severity of such retinal degenerative pathologies. In this work, we propose a robust disease detection architecture with three learning heads: i) a supervised encoder for retinal disease classification, ii) an unsupervised decoder for the reconstruction of disease-specific spatial information, and iii) a novel representation learning module for learning the similarity between encoder-decoder features and enhancing the accuracy of the model. Our experimental results on two publicly available OCT datasets illustrate that the proposed model outperforms existing state-of-the-art models in terms of accuracy, interpretability, and robustness for out-of-distribution retinal disease detection.
Submitted 31 July, 2022; v1 submitted 24 June, 2022;
originally announced June 2022.
-
Combining the Morris Method and Multiple Error Metrics to Assess Aquifer Characteristics and Recharge in the Lower Ticino Basin, in Italy
Authors:
Emily A. Baker,
Alessandro Cappato,
Sara Todeschini,
Lorenzo Tamellini,
Giancarlo Sangalli,
Alessandro Reali,
Sauro Manenti
Abstract:
Groundwater flow model accuracy is often limited by the uncertainty in model parameters that characterize aquifer properties and aquifer recharge. Aquifer properties such as hydraulic conductivity can have an uncertainty spanning orders of magnitude. Meanwhile, parameters used to configure model boundary conditions can introduce additional uncertainty. In this study, the Morris Method sensitivity analysis is performed on multiple quantities of interest to assess the sensitivity of a steady-state groundwater flow model to uncertain input parameters. The Morris Method determines which of these parameters are less influential on model outputs. Uninfluential parameters can be set constant during subsequent parameter optimization to reduce computational expense. Combining multiple quantities of interest (e.g., RMSE, groundwater fluxes) when performing both the Morris Method and parameter optimization offers a more complete assessment of groundwater models, providing a more reliable and physically consistent estimate of uncertain parameters. The parameter optimization procedure also provides us an estimate of the residual uncertainty in the parameter values, resulting in a more complete estimate of the remaining uncertainty. By employing such techniques, the current study was able to estimate the aquifer hydraulic conductivity and recharge rate due to rice field irrigation in a groundwater basin in Northern Italy, revealing that a significant proportion of surficial aquifer recharge (approximately 81-94%) during the later summer is due to the flood irrigation practices applied to these fields.
Submitted 8 September, 2022; v1 submitted 4 June, 2022;
originally announced June 2022.
-
Fast and Robust Femur Segmentation from Computed Tomography Images for Patient-Specific Hip Fracture Risk Screening
Authors:
Pall Asgeir Bjornsson,
Alexander Baker,
Ingmar Fleps,
Yves Pauchard,
Halldor Palsson,
Stephen J. Ferguson,
Sigurdur Sigurdsson,
Vilmundur Gudnason,
Benedikt Helgason,
Lotta Maria Ellingsen
Abstract:
Osteoporosis is a common bone disease that increases the risk of bone fracture. Hip-fracture risk screening methods based on finite element analysis depend on segmented computed tomography (CT) images; however, current femur segmentation methods require manual delineations of large data sets. Here we propose a deep neural network for fully automated, accurate, and fast segmentation of the proximal femur from CT. Evaluation on a set of 1147 proximal femurs with ground truth segmentations demonstrates that our method is apt for hip-fracture risk screening, bringing us one step closer to a clinically viable option for screening at-risk patients for hip-fracture susceptibility.
Submitted 20 April, 2022;
originally announced April 2022.
-
SiMCa: Sinkhorn Matrix Factorization with Capacity Constraints
Authors:
Eric Daoud,
Luca Ganassali,
Antoine Baker,
Marc Lelarge
Abstract:
For a very broad range of problems, recommendation algorithms have been increasingly used over the past decade. In most of these algorithms, the predictions are built upon user-item affinity scores which are obtained from high-dimensional embeddings of items and users. In more complex scenarios, with geometrical or capacity constraints, prediction based on embeddings may not be sufficient and some additional features should be considered in the design of the algorithm. In this work, we study the recommendation problem in the setting where affinities between users and items are based both on their embeddings in a latent space and on their geographical distance in their underlying Euclidean space (e.g., $\mathbb{R}^2$), together with item capacity constraints. This framework is motivated by some real-world applications, for instance in healthcare: the task is to recommend hospitals to patients based on their location, pathology, and hospital capacities. In these applications, there is somewhat of an asymmetry between users and items: items are viewed as static points, their embeddings, capacities and locations constraining the allocation. Upon the observation of an optimal allocation, user embeddings, item capacities, and their positions in their underlying Euclidean space, our aim is to recover item embeddings in the latent space; doing so, we are then able to use this estimate, e.g., to predict future allocations. We propose an algorithm (SiMCa) based on matrix factorization enhanced with optimal transport steps to model user-item affinities and learn item embeddings from observed data. We then illustrate and discuss the results of such an approach for hospital recommendation on synthetic data.
Submitted 18 March, 2022;
originally announced March 2022.
-
DSSIM: a structural similarity index for floating-point data
Authors:
Allison H. Baker,
Alexander Pinard,
Dorit M. Hammerling
Abstract:
Data visualization is a critical component in terms of interacting with floating-point output data from large model simulation codes. Indeed, postprocessing analysis workflows on simulation data often generate a large number of images from the raw data, many of which are then compared to each other or to specified reference images. In this image-comparison scenario, image quality assessment (IQA) measures are quite useful, and the Structural Similarity Index (SSIM) continues to be a popular choice. However, generating large numbers of images can be costly, and plot-specific (but data independent) choices can affect the SSIM value. A natural question is whether we can apply the SSIM directly to the floating-point simulation data and obtain an indication of whether differences in the data are likely to impact a visual assessment, effectively bypassing the creation of a specific set of images from the data. To this end, we propose an alternative to the popular SSIM that can be applied directly to the floating point data, which we refer to as the Data SSIM (DSSIM). While we demonstrate the usefulness of the DSSIM in the context of evaluating differences due to lossy compression on large volumes of simulation data from a popular climate model, the DSSIM may prove useful for many other applications involving simulation or image data.
Submitted 19 March, 2023; v1 submitted 5 February, 2022;
originally announced February 2022.
-
VTGAN: Semi-supervised Retinal Image Synthesis and Disease Prediction using Vision Transformers
Authors:
Sharif Amit Kamran,
Khondker Fariha Hossain,
Alireza Tavakkoli,
Stewart Lee Zuckerbrod,
Salah A. Baker
Abstract:
In Fluorescein Angiography (FA), an exogenous dye is injected in the bloodstream to image the vascular structure of the retina. The injected dye can cause adverse reactions such as nausea, vomiting, anaphylactic shock, and even death. In contrast, color fundus imaging is a non-invasive technique used for photographing the retina but does not have sufficient fidelity for capturing its vascular structure. The only non-invasive method for capturing retinal vasculature is optical coherence tomography-angiography (OCTA). However, OCTA equipment is quite expensive, and stable imaging is limited to small areas on the retina. In this paper, we propose a novel conditional generative adversarial network (GAN) capable of simultaneously synthesizing FA images from fundus photographs while predicting retinal degeneration. The proposed system has the benefit of addressing the problem of imaging retinal vasculature in a non-invasive manner as well as predicting the existence of retinal abnormalities. We use a semi-supervised approach to train our GAN using multiple weighted losses on different modalities of data. Our experiments validate that the proposed architecture exceeds recent state-of-the-art generative networks for fundus-to-angiography synthesis. Moreover, our vision transformer-based discriminators generalize quite well on out-of-distribution data sets for retinal disease prediction.
Submitted 13 August, 2021; v1 submitted 14 April, 2021;
originally announced April 2021.
-
RV-GAN: Segmenting Retinal Vascular Structure in Fundus Photographs using a Novel Multi-scale Generative Adversarial Network
Authors:
Sharif Amit Kamran,
Khondker Fariha Hossain,
Alireza Tavakkoli,
Stewart Lee Zuckerbrod,
Kenton M. Sanders,
Salah A. Baker
Abstract:
High fidelity segmentation of both macro and microvascular structure of the retina plays a pivotal role in determining degenerative retinal diseases, yet it is a difficult problem. Due to successive resolution loss in the encoding phase combined with the inability to recover this lost information in the decoding phase, autoencoding based segmentation approaches are limited in their ability to extract retinal microvascular structure. We propose RV-GAN, a new multi-scale generative architecture for accurate retinal vessel segmentation to alleviate this. The proposed architecture uses two generators and two multi-scale autoencoding discriminators for better microvessel localization and segmentation. In order to avoid the loss of fidelity suffered by traditional GAN-based segmentation systems, we introduce a novel weighted feature matching loss. This new loss incorporates and prioritizes features from the discriminator's decoder over the encoder. Doing so combined with the fact that the discriminator's decoder attempts to determine real or fake images at the pixel level better preserves macro and microvascular structure. By combining reconstruction and weighted feature matching loss, the proposed architecture achieves an area under the curve (AUC) of 0.9887, 0.9914, and 0.9887 in pixel-wise segmentation of retinal vasculature from three publicly available datasets, namely DRIVE, CHASE-DB1, and STARE, respectively. Additionally, RV-GAN outperforms other architectures in two additional relevant metrics, mean intersection-over-union (Mean-IOU) and structural similarity measure (SSIM).
Submitted 14 May, 2021; v1 submitted 2 January, 2021;
originally announced January 2021.
-
Epidemic mitigation by statistical inference from contact tracing data
Authors:
Antoine Baker,
Indaco Biazzo,
Alfredo Braunstein,
Giovanni Catania,
Luca Dall'Asta,
Alessandro Ingrosso,
Florent Krzakala,
Fabio Mazza,
Marc Mézard,
Anna Paola Muntoni,
Maria Refinetti,
Stefano Sarao Mannelli,
Lenka Zdeborová
Abstract:
Contact-tracing is an essential tool for mitigating the impact of pandemics such as COVID-19. In order to achieve efficient and scalable contact-tracing in real time, digital devices can play an important role. While a lot of attention has been paid to analyzing the privacy and ethical risks of the associated mobile applications, so far much less research has been devoted to optimizing their performance and assessing their impact on the mitigation of the epidemic. We develop Bayesian inference methods to estimate the risk that an individual is infected. This inference is based on the list of the individual's recent contacts and their own risk levels, as well as personal information such as results of tests or presence of syndromes. We propose to use probabilistic risk estimation in order to optimize testing and quarantining strategies for the control of an epidemic. Our results show that in some range of epidemic spreading (typically when the manual tracing of all contacts of infected people becomes practically impossible, but before the fraction of infected people reaches the scale where a lock-down becomes unavoidable), this inference of individuals at risk could be an efficient way to mitigate the epidemic. Our approaches translate into fully distributed algorithms that only require communication between individuals who have recently been in contact. Such communication may be encrypted and anonymized and thus compatible with privacy preserving standards. We conclude that probabilistic risk estimation is capable of enhancing the performance of digital contact tracing and should be considered in the currently developed mobile applications.
Submitted 20 September, 2020;
originally announced September 2020.
-
Fundus2Angio: A Conditional GAN Architecture for Generating Fluorescein Angiography Images from Retinal Fundus Photography
Authors:
Sharif Amit Kamran,
Khondker Fariha Hossain,
Alireza Tavakkoli,
Stewart Lee Zuckerbrod,
Salah A. Baker,
Kenton M. Sanders
Abstract:
Carrying out clinical diagnosis of retinal vascular degeneration using Fluorescein Angiography (FA) is a time-consuming process and can pose significant adverse effects on the patient. Angiography requires insertion of a dye that may cause severe adverse effects and can even be fatal. Currently, there are no non-invasive systems capable of generating Fluorescein Angiography images. However, retinal fundus photography is a non-invasive imaging technique that can be completed in a few seconds. In order to eliminate the need for FA, we propose a conditional generative adversarial network (GAN) to translate fundus images to FA images. The proposed GAN consists of a novel residual block capable of generating high quality FA images. These images are important tools in the differential diagnosis of retinal diseases without the need for an invasive procedure with possible side effects. Our experiments show that the proposed architecture outperforms other state-of-the-art generative networks. Furthermore, our proposed model achieves better qualitative results indistinguishable from real angiograms.
Submitted 29 September, 2020; v1 submitted 11 May, 2020;
originally announced May 2020.
-
Tree-AMP: Compositional Inference with Tree Approximate Message Passing
Authors:
Antoine Baker,
Benjamin Aubin,
Florent Krzakala,
Lenka Zdeborová
Abstract:
We introduce Tree-AMP, standing for Tree Approximate Message Passing, a python package for compositional inference in high-dimensional tree-structured models. The package provides a unifying framework to study several approximate message passing algorithms previously derived for a variety of machine learning tasks such as generalized linear models, inference in multi-layer networks, matrix factorization, and reconstruction using non-separable penalties. For some models, the asymptotic performance of the algorithm can be theoretically predicted by the state evolution, and the measurements entropy estimated by the free entropy formalism. The implementation is modular by design: each module, which implements a factor, can be composed at will with other modules to solve complex inference tasks. The user only needs to declare the factor graph of the model: the inference algorithm, state evolution and entropy estimation are fully automated.
Submitted 11 December, 2021; v1 submitted 3 April, 2020;
originally announced April 2020.
-
Learning medical triage from clinicians using Deep Q-Learning
Authors:
Albert Buchard,
Baptiste Bouvier,
Giulia Prando,
Rory Beard,
Michail Livieratos,
Dan Busbridge,
Daniel Thompson,
Jonathan Richens,
Yuanzhao Zhang,
Adam Baker,
Yura Perov,
Kostis Gourgoulias,
Saurabh Johri
Abstract:
Medical Triage is of paramount importance to healthcare systems, allowing for the correct orientation of patients and allocation of the necessary resources to treat them adequately. While reliable decision-tree methods exist to triage patients based on their presentation, those trees implicitly require human inference and are not immediately applicable in a fully automated setting. On the other hand, learning triage policies directly from experts may correct for some of the limitations of hard-coded decision-trees. In this work, we present a Deep Reinforcement Learning approach (a variant of Deep Q-Learning) to triage patients using curated clinical vignettes. The dataset, consisting of 1374 clinical vignettes, was created by medical doctors to represent real-life cases. Each vignette is associated with an average of 3.8 expert triage decisions given by medical doctors relying solely on medical history. We show that this approach is on a par with human performance, yielding safe triage decisions in 94% of cases, and matching expert decisions in 85% of cases. The trained agent learns when to stop asking questions, acquires optimized decision policies requiring less evidence than supervised approaches, and adapts to the novelty of a situation by asking for more information. Overall, we demonstrate that a Deep Reinforcement Learning approach can learn effective medical triage policies directly from expert decisions, without requiring expert knowledge engineering. This approach is scalable and can be deployed in healthcare settings or geographical regions with distinct triage specifications, or where trained experts are scarce, to improve decision making in the early stage of care.
Submitted 24 June, 2020; v1 submitted 28 March, 2020;
originally announced March 2020.
-
Exact asymptotics for phase retrieval and compressed sensing with random generative priors
Authors:
Benjamin Aubin,
Bruno Loureiro,
Antoine Baker,
Florent Krzakala,
Lenka Zdeborová
Abstract:
We consider the problem of compressed sensing and of (real-valued) phase retrieval with random measurement matrix. We derive sharp asymptotics for the information-theoretically optimal performance and for the best known polynomial algorithm for an ensemble of generative priors consisting of fully connected deep neural networks with random weight matrices and arbitrary activations. We compare the performance to sparse separable priors and conclude that generative priors might be advantageous in terms of algorithmic performance. In particular, while sparsity does not allow one to perform compressive phase retrieval efficiently close to its information-theoretic limit, it is found that under the random generative prior compressed phase retrieval becomes tractable.
Submitted 12 June, 2020; v1 submitted 4 December, 2019;
originally announced December 2019.
-
MultiVerse: Causal Reasoning using Importance Sampling in Probabilistic Programming
Authors:
Yura Perov,
Logan Graham,
Kostis Gourgoulias,
Jonathan G. Richens,
Ciarán M. Lee,
Adam Baker,
Saurabh Johri
Abstract:
We elaborate on using importance sampling for causal reasoning, in particular for counterfactual inference. We show how this can be implemented natively in probabilistic programming. By considering the structure of the counterfactual query, one can significantly optimise the inference process. We also consider design choices to enable further optimisations. We introduce MultiVerse, a probabilistic programming prototype engine for approximate causal reasoning. We provide experimental results and compare with Pyro, an existing probabilistic programming framework with some causal reasoning tools.
Submitted 28 January, 2020; v1 submitted 17 October, 2019;
originally announced October 2019.
-
Universal Marginaliser for Deep Amortised Inference for Probabilistic Programs
Authors:
Robert Walecki,
Kostis Gourgoulias,
Adam Baker,
Chris Hart,
Chris Lucas,
Max Zwiessele,
Albert Buchard,
Maria Lomeli,
Yura Perov,
Saurabh Johri
Abstract:
Probabilistic programming languages (PPLs) are powerful modelling tools which allow us to formalise our knowledge about the world and reason about its inherent uncertainty. Inference methods used in PPL can be computationally costly due to significant time burden and/or storage requirements; or they can lack theoretical guarantees of convergence and accuracy when applied to large scale graphical models. To this end, we present the Universal Marginaliser (UM), a novel method for amortised inference, in PPL. We show how combining samples drawn from the original probabilistic program prior with an appropriate augmentation method allows us to train one neural network to approximate any of the corresponding conditional marginal distributions, with any separation into latent and observed variables, and thus amortise the cost of inference. Finally, we benchmark the method on multiple probabilistic programs, in Pyro, with different model structure.
Submitted 16 October, 2019;
originally announced October 2019.
-
On the Universality of Noiseless Linear Estimation with Respect to the Measurement Matrix
Authors:
Alia Abbara,
Antoine Baker,
Florent Krzakala,
Lenka Zdeborová
Abstract:
In a noiseless linear estimation problem, one aims to reconstruct a vector x* from the knowledge of its linear projections y=Phi x*. There have been many theoretical works concentrating on the case where the matrix Phi is a random i.i.d. one, but a body of heuristic evidence suggests that many of these results are universal and extend well beyond this restricted case. Here we revisit this problem through the prism of the development of message passing methods, and consider not only the universality of the l1 transition, as previously addressed, but also that of the optimal Bayesian reconstruction. We observed that the universality extends to the Bayes-optimal minimum mean-squared error (MMSE), and to a range of structured matrices.
Submitted 11 June, 2019;
originally announced June 2019.
-
Q# and NWChem: Tools for Scalable Quantum Chemistry on Quantum Computers
Authors:
Guang Hao Low,
Nicholas P. Bauman,
Christopher E. Granade,
Bo Peng,
Nathan Wiebe,
Eric J. Bylaska,
Dave Wecker,
Sriram Krishnamoorthy,
Martin Roetteler,
Karol Kowalski,
Matthias Troyer,
Nathan A. Baker
Abstract:
Fault-tolerant quantum computation promises to solve outstanding problems in quantum chemistry within the next decade. Realizing this promise requires scalable tools that allow users to translate descriptions of electronic structure problems to optimized quantum gate sequences executed on physical hardware, without requiring specialized quantum computing knowledge. To this end, we present a quantum chemistry library, under the open-source MIT license, that implements and enables straightforward use of state-of-the-art quantum simulation algorithms. The library is implemented in Q#, a language designed to express quantum algorithms at scale, and interfaces with NWChem, a leading electronic structure package. We define a standardized schema for this interface, Broombridge, that describes second-quantized Hamiltonians, along with metadata required for effective quantum simulation, such as trial wavefunction ansatzes. This schema is generated for arbitrary molecules by NWChem, conveniently accessible, for instance, through Docker containers and a recently developed web interface EMSL Arrows. We illustrate use of the library with various examples, including ground- and excited-state calculations for LiH, H$_{10}$, and C$_{20}$ with an active-space simplification, and automatically obtain resource estimates for classically intractable examples.
Submitted 1 April, 2019;
originally announced April 2019.
-
Making root cause analysis feasible for large code bases: a solution approach for a climate model
Authors:
Daniel J. Milroy,
Allison H. Baker,
Dorit M. Hammerling,
Youngsung Kim,
Elizabeth R. Jessup,
Thomas Hauser
Abstract:
For large-scale simulation codes with huge and complex code bases, where bit-for-bit comparisons are too restrictive, finding the source of statistically significant discrepancies (e.g., from a previous version, alternative hardware or supporting software stack) in output is non-trivial at best. Although there are many tools for program comprehension through debugging or slicing, few (if any) scale to a model as large as the Community Earth System Model (CESM; trademarked), which consists of more than 1.5 million lines of Fortran code. Currently for the CESM, we can easily determine whether a discrepancy exists in the output using a by now well-established statistical consistency testing tool. However, this tool provides no information as to the possible cause of the detected discrepancy, leaving developers in a seemingly impossible (and frustrating) situation. Therefore, our aim in this work is to provide the tools to enable developers to trace a problem detected through the CESM output to its source. To this end, our strategy is to reduce the search space for the root cause(s) to a tractable size via a series of techniques that include creating a directed graph of internal CESM variables, extracting a subgraph (using a form of hybrid program slicing), partitioning into communities, and ranking nodes by centrality. Runtime variable sampling then becomes feasible in this reduced search space. We demonstrate the utility of this process on multiple examples of CESM simulation output by illustrating how sampling can be performed as part of an efficient parallel iterative refinement procedure to locate error sources, including sensitivity to CPU instructions. By providing CESM developers with tools to identify and understand the reason for statistically distinct output, we have positively impacted the CESM software development cycle and, in particular, its focus on quality assurance.
Submitted 11 February, 2019; v1 submitted 31 October, 2018;
originally announced October 2018.
-
A comparative study of artificial intelligence and human doctors for the purpose of triage and diagnosis
Authors:
Salman Razzaki,
Adam Baker,
Yura Perov,
Katherine Middleton,
Janie Baxter,
Daniel Mullarkey,
Davinder Sangar,
Michael Taliercio,
Mobasher Butt,
Azeem Majeed,
Arnold DoRosario,
Megan Mahoney,
Saurabh Johri
Abstract:
Online symptom checkers have significant potential to improve patient care; however, their reliability and accuracy remain variable. We hypothesised that an artificial intelligence (AI) powered triage and diagnostic system would compare favourably with human doctors with respect to triage and diagnostic accuracy. We performed a prospective validation study of the accuracy and safety of an AI powered triage and diagnostic system. Identical cases were evaluated by both an AI system and human doctors. Differential diagnoses and triage outcomes were evaluated by an independent judge, who was blinded from knowing the source (AI system or human doctor) of the outcomes. Independently of these cases, vignettes from publicly available resources were also assessed to provide a benchmark to previous studies and the diagnostic component of the MRCGP exam. Overall we found that the Babylon AI powered Triage and Diagnostic System was able to identify the condition modelled by a clinical vignette with accuracy comparable to human doctors (in terms of precision and recall). In addition, we found that the triage advice recommended by the AI System was, on average, safer than that of human doctors, when compared to the ranges of acceptable triage provided by independent expert judges, with only a minimal reduction in appropriateness.
Submitted 27 June, 2018;
originally announced June 2018.
-
A Universal Marginalizer for Amortized Inference in Generative Models
Authors:
Laura Douglas,
Iliyan Zarov,
Konstantinos Gourgoulias,
Chris Lucas,
Chris Hart,
Adam Baker,
Maneesh Sahani,
Yura Perov,
Saurabh Johri
Abstract:
We consider the problem of inference in a causal generative model where the set of available observations differs between data instances. We show how combining samples drawn from the graphical model with an appropriate masking function makes it possible to train a single neural network to approximate all the corresponding conditional marginal distributions and thus amortize the cost of inference. We further demonstrate that the efficiency of importance sampling may be improved by basing proposals on the output of the neural network. We also outline how the same network can be used to generate samples from an approximate joint posterior via a chain decomposition of the graph.
Submitted 2 November, 2017;
originally announced November 2017.
-
A Note on Sparsification by Frames
Authors:
Christopher A. Baker
Abstract:
The purpose of this note is to establish a new generalized Dictionary-Restricted Isometry Property (D-RIP) sparsity bound constant for compressed sensing.
For fulfilling D-RIP, the constant $δ_k$ is used in the definition: $(1 - δ_k)\|D v\|_2^2 \le \|ΦD v\|_2^2 \le (1 + δ_k)\|D v\|_2^2$. We prove that signals with $k$-sparse $D$-representation can be reconstructed if $δ_{2k} < \frac{2}{3}$.
The approach in this note can be extended to obtain other D-RIP bounds (i.e., $δ_{tk}$).
Submitted 19 December, 2014; v1 submitted 23 August, 2013;
originally announced August 2013.
-
Hodge Theory on Metric Spaces
Authors:
Laurent Bartholdi,
Thomas Schick,
Nat Smale,
Steve Smale,
Anthony W. Baker
Abstract:
Hodge theory is a beautiful synthesis of geometry, topology, and analysis, which has been developed in the setting of Riemannian manifolds. On the other hand, spaces of images, which are important in the mathematical foundations of vision and pattern recognition, do not fit this framework. This motivates us to develop a version of Hodge theory on metric spaces with a probability measure. We believe that this constitutes a step towards understanding the geometry of vision.
The appendix by Anthony Baker provides a separable, compact metric space with infinite dimensional α-scale homology.
Submitted 24 November, 2011; v1 submitted 1 December, 2009;
originally announced December 2009.