
Showing 1–13 of 13 results for author: Tirunagari, S

  1. arXiv:2505.12534  [pdf, other]

    cs.LG

    ChemPile: A 250GB Diverse and Curated Dataset for Chemical Foundation Models

    Authors: Adrian Mirza, Nawaf Alampara, Martiño Ríos-García, Mohamed Abdelalim, Jack Butler, Bethany Connolly, Tunca Dogan, Marianna Nezhurina, Bünyamin Şen, Santosh Tirunagari, Mark Worrall, Adamo Young, Philippe Schwaller, Michael Pieler, Kevin Maik Jablonka

    Abstract: Foundation models have shown remarkable success across scientific domains, yet their impact in chemistry remains limited due to the absence of diverse, large-scale, high-quality datasets that reflect the field's multifaceted nature. We present the ChemPile, an open dataset containing over 75 billion tokens of curated chemical data, specifically built for training and evaluating general-purpose mod…

    Submitted 18 May, 2025; originally announced May 2025.

  2. arXiv:2401.01414  [pdf, other]

    eess.IV cs.LG

    VALD-MD: Visual Attribution via Latent Diffusion for Medical Diagnostics

    Authors: Ammar A. Siddiqui, Santosh Tirunagari, Tehseen Zia, David Windridge

    Abstract: Visual attribution in medical imaging seeks to make evident the diagnostically-relevant components of a medical image, in contrast to the more common detection of diseased tissue deployed in standard machine vision pipelines (which are less straightforwardly interpretable/explainable to clinicians). We here present a novel generative visual attribution technique, one that leverages latent diffusio…

    Submitted 2 January, 2024; originally announced January 2024.

  3. arXiv:1905.11387  [pdf, other]

    eess.IV cs.CV

    Automatic Delineation of Kidney Region in DCE-MRI

    Authors: Santosh Tirunagari, Norman Poh, Kevin Wells, Miroslaw Bober, Isky Gorden, David Windridge

    Abstract: Delineation of the kidney region in dynamic contrast-enhanced magnetic resonance imaging (DCE-MRI) is required during post-acquisition analysis in order to quantify various aspects of renal function, such as filtration and perfusion or blood flow. However, this can be obfuscated by the Partial Volume Effect (PVE), caused by the mixing of any single voxel with two or more signal intensities fro…

    Submitted 26 May, 2019; originally announced May 2019.

    Comments: arXiv admin note: text overlap with arXiv:1905.10218

  4. arXiv:1905.10218  [pdf, other]

    eess.IV cs.CV

    Functional Segmentation through Dynamic Mode Decomposition: Automatic Quantification of Kidney Function in DCE-MRI Images

    Authors: Santosh Tirunagari, Norman Poh, Kevin Wells, Miroslaw Bober, Isky Gorden, David Windridge

    Abstract: Quantification of kidney function in Dynamic Contrast-Enhanced Magnetic Resonance Imaging (DCE-MRI) requires careful segmentation of the renal region of interest (ROI). Traditionally, human experts are required to manually delineate the kidney ROI across multiple images in the dynamic sequence. This approach is costly, time-consuming and labour intensive, and therefore acts to limit patient throug…

    Submitted 24 May, 2019; originally announced May 2019.

  5. arXiv:1612.01409  [pdf, other]

    q-bio.QM stat.AP

    Probabilistic Broken-Stick Model: A Regression Algorithm for Irregularly Sampled Data with Application to eGFR

    Authors: Norman Poh, Simon Bull, Santosh Tirunagari, Nicholas Cole, Simon de Lusignan

    Abstract: In order for clinicians to manage disease progression and make effective decisions about drug dosage, treatment regimens or scheduling follow up appointments, it is necessary to be able to identify both short and long-term trends in repeated biomedical measurements. However, this is complicated by the fact that these measurements are irregularly sampled and influenced by both genuine physiological…

    Submitted 30 November, 2016; originally announced December 2016.

    Comments: Preprint submitted to Journal of Biomedical Informatics

  6. arXiv:1609.05716  [pdf, other]

    q-bio.QM cs.NE

    Visualisation of Survey Responses using Self-Organising Maps: A Case Study on Diabetes Self-care Factors

    Authors: Santosh Tirunagari, Simon Bull, Samaneh Kouchaki, Deborah Cooke, Norman Poh

    Abstract: Due to the chronic nature of diabetes, patient self-care factors play an important role in any treatment plan. In order to understand the behaviour of patients in response to medical advice on self-care, clinicians often conduct cross-sectional surveys. When analysing the survey data, statistical machine learning methods can potentially provide additional insight into the data either through deepe…

    Submitted 30 August, 2016; originally announced September 2016.

  7. arXiv:1609.04214  [pdf, ps, other]

    cs.CR cs.AI cs.NI

    "Flow Size Difference" Can Make a Difference: Detecting Malicious TCP Network Flows Based on Benford's Law

    Authors: Aamo Iorliam, Santosh Tirunagari, Anthony T. S. Ho, Shujun Li, Adrian Waller, Norman Poh

    Abstract: Statistical characteristics of network traffic have attracted a significant amount of research for automated network intrusion detection, some of which looked at applications of natural statistical laws such as Zipf's law, Benford's law and the Pareto distribution. In this paper, we present the application of Benford's law to a new network flow metric "flow size difference", which has not been st…

    Submitted 20 January, 2017; v1 submitted 14 September, 2016; originally announced September 2016.

    Comments: 13 pages, 3 figures

    ACM Class: C.2; K.6.5

  8. arXiv:1607.06783  [pdf]

    cs.CV

    Can DMD obtain a Scene Background in Color?

    Authors: Santosh Tirunagari, Norman Poh, Miroslaw Bober, David Windridge

    Abstract: A background model describes a scene without any foreground objects and has a number of applications, ranging from video surveillance to computational photography. Recent studies have introduced the method of Dynamic Mode Decomposition (DMD) for robustly separating video frames into a background model and foreground components. While the method introduced operates by converting color images to gra…

    Submitted 22 July, 2016; originally announced July 2016.

    Comments: International Conference on Image, Vision and Computing (ICIVC 2016), August 3-5, 2016, Portsmouth, UK

  9. arXiv:1605.05142  [pdf, other]

    cs.LG cs.CE

    Automatic Classification of Irregularly Sampled Time Series with Unequal Lengths: A Case Study on Estimated Glomerular Filtration Rate

    Authors: Santosh Tirunagari, Simon Bull, Norman Poh

    Abstract: A patient's estimated glomerular filtration rate (eGFR) can provide important information about disease progression and kidney function. Traditionally, an eGFR time series is interpreted by a human expert labelling it as stable or unstable. While this approach works for individual patients, its time-consuming nature precludes the quick evaluation of risk in large numbers of patients. However…

    Submitted 17 May, 2016; originally announced May 2016.

    Report number: CS-CKD-2016-01

  10. arXiv:1507.02447  [pdf, other]

    cs.IR cs.CL

    Data Mining of Causal Relations from Text: Analysing Maritime Accident Investigation Reports

    Authors: Santosh Tirunagari

    Abstract: Text mining is a process of extracting information of interest from text. Such a method includes techniques from various areas such as Information Retrieval (IR), Natural Language Processing (NLP), and Information Extraction (IE). In this study, text mining methods are applied to extract causal relations from maritime accident investigation reports collected from the Marine Accident Investigation…

    Submitted 9 July, 2015; originally announced July 2015.

  11. arXiv:1503.06331  [pdf, other]

    cs.CE physics.flu-dyn

    Exploratory Data Analysis of The KelvinHelmholtz instability in Jets

    Authors: Santosh Tirunagari

    Abstract: The Kelvin-Helmholtz (KH) instability is a fundamental wave instability that is frequently observed in all kinds of shear layers (jets, wakes, atmospheric air currents, etc.). The study of the KH instability and its coherent flow structures has a major impact on understanding the fundamentals of fluid dynamics. Therefore there is a need for methods that can identify and analyse these structures. In this Final as…

    Submitted 21 March, 2015; originally announced March 2015.

    Report number: DNS-Report-2012

  12. arXiv:1503.06316  [pdf, other]

    cs.CE cs.AI

    Identifying Similar Patients Using Self-Organising Maps: A Case Study on Type-1 Diabetes Self-care Survey Responses

    Authors: Santosh Tirunagari, Norman Poh, Guosheng Hu, David Windridge

    Abstract: Diabetes is considered a lifestyle disease, and well-managed self-care plays an important role in its treatment. Clinicians often conduct surveys to understand the self-care behaviors of their patients. In this context, we propose to use Self-Organising Maps (SOM) to explore the survey data for assessing the self-care behaviors in Type-1 diabetic patients. Specifically, SOM is used to visualize h…

    Submitted 21 March, 2015; originally announced March 2015.

    Comments: 01-05 pages

    Report number: TR-DoC-02

  13. arXiv:1503.03680  [pdf, other]

    q-bio.QM

    Breast Cancer Data Analytics With Missing Values: A study on Ethnic, Age and Income Groups

    Authors: Santosh Tirunagari, Norman Poh, Hajara Abdulrahman, Nawal Nemmour, David Windridge

    Abstract: The analysis of breast cancer incidence in women and the relationship between ethnicity and survival rate has been an ongoing study, with recorded incidences of missing values in the secondary data. In this paper, we study and report the results of breast cancer survival rate by ethnicity, age and income groups from the dataset collected for 53593 patients in South East England between the years 19…

    Submitted 12 March, 2015; originally announced March 2015.

    Comments: The paper analyzes breast cancer data with missing values, where the missing values of ethnicity are imputed using a Naive Bayes classifier. The data were also analysed from a domain perspective, such as the effect of ethnicity, age, and income on breast cancer survival.