Search | arXiv e-print repository

arXiv:2507.00983 [pdf, ps, other]

DMCIE: Diffusion Model with Concatenation of Inputs and Errors to Improve the Accuracy of the Segmentation of Brain Tumors in MRI Images

Authors: Sara Yavari, Rahul Nitin Pandya, Jacob Furst

Abstract: Accurate segmentation of brain tumors in MRI scans is essential for reliable clinical diagnosis and effective treatment planning. Recently, diffusion models have demonstrated remarkable effectiveness in image generation and segmentation tasks. This paper introduces a novel approach to corrective segmentation based on diffusion models. We propose DMCIE (Diffusion Model with Concatenation of Inputs and Errors), a novel framework for accurate brain tumor segmentation in multi-modal MRI scans. We employ a 3D U-Net to generate an initial segmentation mask, from which an error map is generated by identifying the differences between the prediction and the ground truth. The error map, concatenated with the original MRI images, is used to guide a diffusion model. Using multimodal MRI inputs (T1, T1ce, T2, FLAIR), DMCIE effectively enhances segmentation accuracy by focusing on misclassified regions, guided by the original inputs. Evaluated on the BraTS2020 dataset, DMCIE outperforms several state-of-the-art diffusion-based segmentation methods, achieving a Dice Score of 93.46 and an HD95 of 5.94 mm. These results highlight the effectiveness of error-guided diffusion in producing precise and reliable brain tumor segmentations.

Submitted 1 July, 2025; originally announced July 2025.
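
As a rough illustration of the error-map idea described in the abstract above, the following sketch builds a binary error map from a predicted mask and a ground-truth mask, stacks it onto a four-channel MRI volume, and computes a Dice score. The array shapes, the toy data, and the function names `error_map`, `concat_inputs_and_errors`, and `dice_score` are assumptions made for illustration; this is not the authors' implementation, and the 3D U-Net and the diffusion model are omitted entirely.

```python
import numpy as np

def error_map(pred, truth):
    """Binary map that is 1 where the prediction disagrees with the ground truth."""
    return (pred.astype(bool) != truth.astype(bool)).astype(np.float32)

def concat_inputs_and_errors(mri, err):
    """Stack the error map as an extra channel on a (C, D, H, W) MRI volume."""
    return np.concatenate([mri, err[np.newaxis]], axis=0)

def dice_score(pred, truth, eps=1e-8):
    """Dice = 2|P n T| / (|P| + |T|) for binary masks."""
    p, t = pred.astype(bool), truth.astype(bool)
    return 2.0 * np.logical_and(p, t).sum() / (p.sum() + t.sum() + eps)

# Toy data (assumed): 4 modalities (T1, T1ce, T2, FLAIR) on a 2x4x4 volume.
rng = np.random.default_rng(0)
mri = rng.normal(size=(4, 2, 4, 4)).astype(np.float32)

truth = np.zeros((2, 4, 4), dtype=np.uint8)
truth[:, 1:3, 1:3] = 1          # 8 tumor voxels

pred = truth.copy()
pred[0, 1, 1] = 0               # one false negative
pred[1, 0, 0] = 1               # one false positive

err = error_map(pred, truth)
guided_input = concat_inputs_and_errors(mri, err)   # shape (5, 2, 4, 4)
dice = dice_score(pred, truth)
```

In this toy case the error map marks exactly the two misclassified voxels, and the five-channel array is the kind of input that could condition a corrective model.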

arXiv:2505.07917 [pdf, other]

Efficient and Reproducible Biomedical Question Answering using Retrieval Augmented Generation

Authors: Linus Stuhlmann, Michael Alexander Saxer, Jonathan Fürst

Abstract: Biomedical question-answering (QA) systems require effective retrieval and generation components to ensure accuracy, efficiency, and scalability. This study systematically examines a Retrieval-Augmented Generation (RAG) system for biomedical QA, evaluating retrieval strategies and response time trade-offs. We first assess state-of-the-art retrieval methods, including BM25, BioBERT, MedCPT, and a hybrid approach, alongside common data stores such as Elasticsearch, MongoDB, and FAISS, on a ~10% subset of PubMed (2.4M documents) to measure indexing efficiency, retrieval latency, and retriever performance in the end-to-end RAG system. Based on these insights, we deploy the final RAG system on the full 24M PubMed corpus, comparing different retrievers' impact on overall performance. Evaluations of the retrieval depth show that retrieving 50 documents with BM25 before reranking with MedCPT optimally balances accuracy (0.90), recall (0.90), and response time (1.91s). BM25 retrieval time remains stable (82ms), while MedCPT incurs the main computational cost. These results highlight previously little-known trade-offs in retrieval depth, efficiency, and scalability for biomedical QA. With open-source code, the system is fully reproducible and extensible.

Submitted 12 May, 2025; originally announced May 2025.

Comments: Accepted at SDS25
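
The retrieve-then-rerank pattern mentioned in the abstract above (a cheap lexical pass to a fixed depth, followed by a costlier reranker) can be sketched in plain Python. This is a minimal sketch under stated assumptions: the BM25 parameters `k1=1.5` and `b=0.75` are conventional defaults rather than values from the paper, the toy corpus is invented, and `overlap_reranker` is a hypothetical stand-in for a neural reranker such as MedCPT, not its actual API.

```python
import math
from collections import Counter

def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Okapi BM25 score of each tokenized document for a tokenized query."""
    n = len(docs)
    avgdl = sum(len(d) for d in docs) / n
    df = Counter(term for d in docs for term in set(d))  # document frequencies
    scores = []
    for d in docs:
        tf = Counter(d)
        s = 0.0
        for term in query:
            if term not in tf:
                continue
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(d) / avgdl)
            s += idf * tf[term] * (k1 + 1) / norm
        scores.append(s)
    return scores

def retrieve_then_rerank(query, docs, rerank_fn, depth=50, top_k=10):
    """Keep the `depth` best BM25 hits, then reorder them with `rerank_fn`."""
    scores = bm25_scores(query, docs)
    candidates = sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)[:depth]
    reranked = sorted(candidates, key=lambda i: rerank_fn(query, docs[i]), reverse=True)
    return reranked[:top_k]

def overlap_reranker(query, doc):
    """Hypothetical reranker: fraction of distinct query terms found in the document."""
    return len(set(query) & set(doc)) / len(set(query))

# Invented toy corpus, tokenized by whitespace.
corpus = [
    "aspirin reduces risk of heart attack".split(),
    "heart failure treatment guidelines".split(),
    "aspirin dosage for adults".split(),
    "diabetes and insulin therapy".split(),
]
query = "aspirin heart attack".split()
ranking = retrieve_then_rerank(query, corpus, overlap_reranker, depth=3, top_k=2)
```

The `depth` argument is the knob the abstract refers to as retrieval depth: a larger value gives the reranker more candidates at the price of more reranker calls.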

arXiv:2504.20033 [pdf, other]

Mitigating Catastrophic Forgetting in the Incremental Learning of Medical Images

Authors: Sara Yavari, Jacob Furst

Abstract: This paper proposes an Incremental Learning (IL) approach to enhance the accuracy and efficiency of deep learning models in analyzing T2-weighted (T2w) MRI medical images for prostate cancer detection using the PI-CAI dataset. We used multiple health centers' artificial intelligence and radiology data, focused on different tasks that looked at prostate cancer detection using MRI (PI-CAI). We utilized Knowledge Distillation (KD), as it employs generated images from past tasks to guide the training of models for subsequent tasks. The approach yielded improved performance and faster convergence of the models. To demonstrate the versatility and robustness of our approach, we evaluated it on the PI-CAI dataset, a diverse set of medical imaging modalities including OCT and PathMNIST, and the benchmark continual learning dataset CIFAR-10. Our results indicate that KD can be a promising technique for IL in medical image analysis in which data is sourced from individual health centers and the storage of large datasets is not feasible. By using generated images from prior tasks, our method enables the model to retain and apply previously acquired knowledge without direct access to the original data.

Submitted 28 April, 2025; originally announced April 2025.

Comments: 15 Pages, 3 Figures, 3 Tables, 1 Algorithm, This paper will be updated

ACM Class: I.2.6; I.2.10

arXiv:2502.18179 [pdf, other]

Problem Solved? Information Extraction Design Space for Layout-Rich Documents using LLMs

Authors: Gaye Colakoglu, Gürkan Solmaz, Jonathan Fürst

Abstract: This paper defines and explores the design space for information extraction (IE) from layout-rich documents using large language models (LLMs). The three core challenges of layout-aware IE with LLMs are 1) data structuring, 2) model engagement, and 3) output refinement. Our study delves into the sub-problems within these core challenges, such as input representation, chunking, prompting, and selection of LLMs and multimodal models. It examines the outcomes of different design choices through a new layout-aware IE test suite, benchmarking against the state-of-the-art (SoA) model LayoutLMv3. The results show that the configuration from a one-factor-at-a-time (OFAT) trial achieves near-optimal results with a 14.1-point F1-score gain over the baseline model, while full factorial exploration yields only a slightly higher 15.1-point gain at around 36x greater token usage. We demonstrate that well-configured general-purpose LLMs can match the performance of specialized models, providing a cost-effective alternative. Our test-suite is freely available at https://github.com/gayecolakoglu/LayIE-LLM.

Submitted 25 February, 2025; originally announced February 2025.
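
To make the contrast between one-factor-at-a-time (OFAT) and full factorial exploration in the abstract above more concrete, the sketch below enumerates both kinds of configuration set for a small design space. The factor names and levels are hypothetical and are not taken from the paper, and the OFAT variant shown (vary one factor from a fixed baseline) is a simplification; the resulting counts are unrelated to the 36x token-usage figure the authors report.

```python
from itertools import product

# Hypothetical design space: factor names and levels are illustrative only.
design_space = {
    "input_representation": ["text", "text+layout", "markdown"],
    "chunking": ["none", "page", "fixed"],
    "prompting": ["zero-shot", "few-shot", "chain-of-thought"],
    "model": ["llm_a", "llm_b"],
}

def ofat_configs(space):
    """Baseline plus every configuration that changes exactly one factor."""
    baseline = {name: levels[0] for name, levels in space.items()}
    configs = [baseline]
    for name, levels in space.items():
        for level in levels[1:]:
            configs.append({**baseline, name: level})
    return configs

def full_factorial_configs(space):
    """Every combination of levels across all factors."""
    names = list(space)
    return [dict(zip(names, combo)) for combo in product(*space.values())]

n_ofat = len(ofat_configs(design_space))            # 1 + 2 + 2 + 2 + 1 = 8
n_full = len(full_factorial_configs(design_space))  # 3 * 3 * 3 * 2 = 54
```

OFAT grows with the sum of the level counts while full factorial grows with their product, which is why the gap in evaluation cost widens quickly as factors are added.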

arXiv:2412.18428 [pdf, other]

Explainable Multi-Modal Data Exploration in Natural Language via LLM Agent

Authors: Farhad Nooralahzadeh, Yi Zhang, Jonathan Furst, Kurt Stockinger

Abstract: International enterprises, organizations, or hospitals collect large amounts of multi-modal data stored in databases, text documents, images, and videos. While there has been recent progress in the separate fields of multi-modal data exploration as well as in database systems that automatically translate natural language questions to database query languages, the research challenge of querying database systems combined with other unstructured modalities such as images in natural language is largely unexplored. In this paper, we propose XMODE - a system that enables explainable, multi-modal data exploration in natural language. Our approach is based on the following research contributions: (1) Our system is inspired by a real-world use case that enables users to explore multi-modal information systems. (2) XMODE leverages an LLM-based agentic AI framework to decompose a natural language question into subtasks such as text-to-SQL generation and image analysis. (3) Experimental results on multi-modal datasets over relational data and images demonstrate that our system outperforms state-of-the-art multi-modal exploration systems, excelling not only in accuracy but also in various performance metrics such as query latency, API costs, planning efficiency, and explanation quality, thanks to the more effective utilization of the reasoning capabilities of LLMs.

Submitted 24 December, 2024; originally announced December 2024.

arXiv:2411.05521 [pdf, other]

SM3-Text-to-Query: Synthetic Multi-Model Medical Text-to-Query Benchmark

Authors: Sithursan Sivasubramaniam, Cedric Osei-Akoto, Yi Zhang, Kurt Stockinger, Jonathan Fuerst

Abstract: Electronic health records (EHRs) are stored in various database systems with different database models on heterogeneous storage architectures, such as relational databases, document stores, or graph databases. These different database models have a big impact on query complexity and performance. While this has been a known fact in database research, its implications for the growing number of Text-to-Query systems have surprisingly not been investigated so far. In this paper, we present SM3-Text-to-Query, the first multi-model medical Text-to-Query benchmark based on synthetic patient data from Synthea, following the SNOMED-CT taxonomy -- a widely used knowledge graph ontology covering medical terminology. SM3-Text-to-Query provides data representations for relational databases (PostgreSQL), document stores (MongoDB), and graph databases (Neo4j and GraphDB (RDF)), allowing the evaluation across four popular query languages, namely SQL, MQL, Cypher, and SPARQL. We systematically and manually develop 408 template questions, which we augment to construct a benchmark of 10K diverse natural language question/query pairs for these four query languages (40K pairs overall). On our dataset, we evaluate several common in-context-learning (ICL) approaches for a set of representative closed and open-source LLMs. Our evaluation sheds light on the trade-offs between database models and query languages for different ICL strategies and LLMs. Last, SM3-Text-to-Query is easily extendable to additional query languages or real, standard-based patient databases.

Submitted 14 November, 2024; v1 submitted 8 November, 2024; originally announced November 2024.

Comments: NeurIPS 2024 Track Datasets and Benchmarks

arXiv:2409.18596 [pdf, ps, other]

doi 10.1145/3649409.3691083

ASAG2024: A Combined Benchmark for Short Answer Grading

Authors: Gérôme Meyer, Philip Breuer, Jonathan Fürst

Abstract: Open-ended questions test a more thorough understanding than closed-ended questions and are often a preferred assessment method. However, open-ended questions are tedious to grade and subject to personal bias. Therefore, there have been efforts to speed up the grading process through automation. Short Answer Grading (SAG) systems aim to automatically score students' answers. Despite growth in SAG methods and capabilities, there exists no comprehensive short-answer grading benchmark across different subjects, grading scales, and distributions. Thus, it is hard to assess the capabilities of current automated grading methods in terms of their generalizability. In this preliminary work, we introduce the combined ASAG2024 benchmark to facilitate the comparison of automated grading systems, combining seven commonly used short-answer grading datasets in a common structure and grading scale. For our benchmark, we evaluate a set of recent SAG methods, revealing that while LLM-based approaches reach new high scores, they still are far from reaching human performance. This opens up avenues for future research on human-machine SAG systems.

Submitted 27 September, 2024; originally announced September 2024.

Comments: Accepted at SIGCSE-Virtual 2024

arXiv:2409.10776 [pdf, other]

doi 10.2478/jdis-2024-0019

Research evolution of metal organic frameworks: A scientometric approach with human-in-the-loop

Authors: Xintong Zhao, Kyle Langlois, Jacob Furst, Yuan An, Xiaohua Hu, Diego Gomez Gualdron, Fernando Uribe-Romo, Jane Greenberg

Abstract: This paper reports on a scientometric analysis, bolstered by human-in-the-loop domain experts, to examine the field of metal organic frameworks (MOFs) research. Scientometric analyses reveal the intellectual landscape of a field. The study engaged MOF scientists in the design and review of our research workflow. MOF materials are an essential component in next generation renewable energy storage and biomedical technologies. The research approach demonstrates how engaging experts, via human-in-the-loop processes, can help develop a comprehensive view of a field's research trends, influential works, and specialized topics.

Submitted 16 September, 2024; originally announced September 2024.

arXiv:2404.07663 [pdf, other]

Interactive Ontology Matching with Cost-Efficient Learning

Authors: Bin Cheng, Jonathan Fürst, Tobias Jacobs, Celia Garrido-Hidalgo

Abstract: The creation of high-quality ontologies is crucial for data integration and knowledge-based reasoning, specifically in the context of the rising data economy. However, automatic ontology matchers are often bound to the heuristics they are based on, leaving many matches unidentified. Interactive ontology matching systems involving human experts have been introduced, but they do not solve the fundamental issue of flexibly finding additional matches outside the scope of the implemented heuristics, even though this is highly demanded in industrial settings. Active machine learning methods appear to be a promising path towards a flexible interactive ontology matcher. However, off-the-shelf active learning mechanisms suffer from low query efficiency due to extreme class imbalance, resulting in a last-mile problem where high human effort is required to identify the remaining matches. To address the last-mile problem, this work introduces DualLoop, an active learning method tailored to ontology matching. DualLoop offers three main contributions: (1) an ensemble of tunable heuristic matchers, (2) a short-term learner with a novel query strategy adapted to highly imbalanced data, and (3) long-term learners to explore potential matches by creating and tuning new heuristics. We evaluated DualLoop on three datasets of varying sizes and domains. Compared to existing active learning methods, we consistently achieved better F1 scores and recall, reducing the expected query cost spent on finding 90% of all matches by over 50%. Compared to traditional interactive ontology matchers, we are able to find additional, last-mile matches. Finally, we detail the successful deployment of our approach within an actual product and report its operational performance results within the Architecture, Engineering, and Construction (AEC) industry sector, showcasing its practical value and efficiency.

Submitted 11 April, 2024; originally announced April 2024.

arXiv:2402.08349 [pdf, other]

Evaluating the Data Model Robustness of Text-to-SQL Systems Based on Real User Queries

Authors: Jonathan Fürst, Catherine Kosten, Farhad Nooralahzadeh, Yi Zhang, Kurt Stockinger

Abstract: Text-to-SQL systems (also known as NL-to-SQL systems) have become an increasingly popular solution for bridging the gap between user capabilities and SQL-based data access. These systems translate user requests in natural language to valid SQL statements for a specific database. Recent Text-to-SQL systems have benefited from the rapid improvement of transformer-based language models. However, while Text-to-SQL systems that incorporate such models continuously reach new high scores on -- often synthetic -- benchmark datasets, a systematic exploration of their robustness towards different data models in a real-world, realistic scenario is notably missing. This paper provides the first in-depth evaluation of the data model robustness of Text-to-SQL systems in practice based on a multi-year international project focused on Text-to-SQL interfaces. Our evaluation is based on a real-world deployment of FootballDB, a system that was deployed over a 9-month period in the context of the FIFA World Cup 2022, during which about 6K natural language questions were asked and executed. All of our data is based on real user questions that were asked live to the system. We manually labeled and translated a subset of these questions for three different data models. For each data model, we explore the performance of representative Text-to-SQL systems and language models. We further quantify the impact of training data size, pre-, and post-processing steps as well as language model inference time. Our comprehensive evaluation sheds light on the design choices of real-world Text-to-SQL systems and their impact on moving from research prototypes to real deployments. Last, we provide a new benchmark dataset to the community, which is the first to enable the evaluation of different data models for the same dataset and is substantially more challenging than most previous datasets in terms of query complexity.

Submitted 29 November, 2024; v1 submitted 13 February, 2024; originally announced February 2024.

arXiv:2310.12417 [pdf, other]

Metadata for Scientific Experiment Reporting: A Case Study in Metal-Organic Frameworks

Authors: Xintong Zhao, Kyle Langlois, Jacob Furst, Scott McClellan, Xiaohua Hu, Yuan An, Diego A. Gómez-Gualdrón, Fernando J. Uribe-Romo, Jane Greenberg

Abstract: Research methods and procedures are core aspects of the research process. Metadata focused on these components is critical to supporting the FAIR principles, particularly reproducibility. The research reported on in this paper presents a methodological framework for metadata documentation supporting the reproducibility of research producing Metal Organic Frameworks (MOFs). The MOF case study involved natural language processing to extract key synthesis experiment information from a corpus of research literature. Following this, a classification activity was performed by domain experts to identify entity-relation pairs. Results include: 1) a research framework for metadata design, 2) a metadata schema that includes nine entities and two relationships for reporting MOF synthesis experiments, and 3) a growing database of MOF synthesis reports structured by our metadata schema. The metadata schema is intended to support discovery and reproducibility of metal-organic framework research and the FAIR principles. The paper provides background information, identifies the research goals and objectives, research design, results, a discussion, and the conclusion.

Submitted 18 October, 2023; originally announced October 2023.

Comments: Accepted by the 17th International Conference on Metadata and Semantics Research

arXiv:2309.11361 [pdf, other]

Knowledge Graph Question Answering for Materials Science (KGQA4MAT): Developing Natural Language Interface for Metal-Organic Frameworks Knowledge Graph (MOF-KG) Using LLM

Authors: Yuan An, Jane Greenberg, Alex Kalinowski, Xintong Zhao, Xiaohua Hu, Fernando J. Uribe-Romo, Kyle Langlois, Jacob Furst, Diego A. Gómez-Gualdrón

Abstract: We present a comprehensive benchmark dataset for Knowledge Graph Question Answering in Materials Science (KGQA4MAT), with a focus on metal-organic frameworks (MOFs). A knowledge graph for metal-organic frameworks (MOF-KG) has been constructed by integrating structured databases and knowledge extracted from the literature. To enhance MOF-KG accessibility for domain experts, we aim to develop a natural language interface for querying the knowledge graph. We have developed a benchmark comprised of 161 complex questions involving comparison, aggregation, and complicated graph structures. Each question is rephrased in three additional variations, resulting in 644 questions and 161 KG queries. To evaluate the benchmark, we have developed a systematic approach for utilizing the LLM, ChatGPT, to translate natural language questions into formal KG queries. We also apply the approach to the well-known QALD-9 dataset, demonstrating ChatGPT's potential in addressing KGQA issues for different platforms and query languages. The benchmark and the proposed approach aim to stimulate further research and development of user-friendly and efficient interfaces for querying domain-specific materials science knowledge graphs, thereby accelerating the discovery of novel materials.

Submitted 6 June, 2024; v1 submitted 20 September, 2023; originally announced September 2023.

Comments: In 17th International Conference on Metadata and Semantics Research, October 2023

arXiv:2209.14454 [pdf]

CompNet: A Designated Model to Handle Combinations of Images and Designed features

Authors: Bowen Qiu, Daniela Raicu, Jacob Furst, Roselyne Tchoua

Abstract: Convolutional neural networks (CNNs) are one of the most popular models of Artificial Neural Networks (ANNs) in Computer Vision (CV). A variety of CNN-based structures were developed by researchers to solve problems like image classification, object detection, and image similarity measurement. Although CNNs have shown their value in most cases, they still have a downside: they easily overfit when there are not enough samples in the dataset. Most medical image datasets are examples of such a dataset. Additionally, many datasets also contain both designed features and images, but CNNs can only deal with images directly. This represents a missed opportunity to leverage additional information. For this reason, we propose a new structure of CNN-based model: CompNet, a composite convolutional neural network. This is a specially designed neural network that accepts combinations of images and designed features as input in order to leverage all available information. The novelty of this structure is that it uses learned features from images to weight designed features in order to gain all information from both images and designed features. With the use of this structure on classification tasks, the results indicate that our approach has the capability to significantly reduce overfitting. Furthermore, we also found several similar approaches proposed by other researchers that can combine images and designed features. To make a comparison, we first applied those similar approaches on LIDC and compared the results with the CompNet results, then we applied our CompNet on the datasets that those similar approaches originally used in their works and compared the results with the results they proposed in their papers. All these comparison results showed that our model outperformed those similar approaches on classification tasks either on LIDC dataset or on their proposed datasets.

Submitted 28 September, 2022; originally announced September 2022.

arXiv:2207.04502 [pdf, other]

Building Open Knowledge Graph for Metal-Organic Frameworks (MOF-KG): Challenges and Case Studies

Authors: Yuan An, Jane Greenberg, Xintong Zhao, Xiaohua Hu, Scott McClellan, Alex Kalinowski, Fernando J. Uribe-Romo, Kyle Langlois, Jacob Furst, Diego A. Gómez-Gualdrón, Fernando Fajardo-Rojas, Katherine Ardila

Abstract: Metal-Organic Frameworks (MOFs) are a class of modular, porous crystalline materials that have great potential to revolutionize applications such as gas storage, molecular separations, chemical sensing, catalysis, and drug delivery. The Cambridge Structural Database (CSD) reports 10,636 synthesized MOF crystals and, in addition, contains ca. 114,373 MOF-like structures. The sheer number of synthesized (plus potentially synthesizable) MOF structures requires researchers to pursue computational techniques to screen and isolate MOF candidates. In this demo paper, we describe our effort on leveraging knowledge graph methods to facilitate MOF prediction, discovery, and synthesis. We present challenges and case studies about (1) construction of a MOF knowledge graph (MOF-KG) from structured and unstructured sources and (2) leveraging the MOF-KG for discovery of new or missing knowledge.

Submitted 29 November, 2023; v1 submitted 10 July, 2022; originally announced July 2022.

Comments: Accepted by the International Workshop on Knowledge Graphs and Open Knowledge Network (OKN'22) Co-located with the 28th ACM SIGKDD Conference

arXiv:2205.10900 [pdf, other]

Visual Explanations from Deep Networks via Riemann-Stieltjes Integrated Gradient-based Localization

Authors: Mirtha Lucas, Miguel Lerma, Jacob Furst, Daniela Raicu

Abstract: Neural networks are becoming increasingly better at tasks that involve classifying and recognizing images. At the same time, techniques intended to explain the network output have been proposed. One such technique is the Gradient-based Class Activation Map (Grad-CAM), which is able to locate features of an input image at various levels of a convolutional neural network (CNN), but is sensitive to the vanishing gradients problem. There are techniques, such as Integrated Gradients (IG), that are not affected by that problem, but their use is limited to the input layer of a network. Here we introduce a new technique to produce visual explanations for the predictions of a CNN. Like Grad-CAM, our method can be applied to any layer of the network, and like Integrated Gradients it is not affected by the problem of vanishing gradients. For efficiency, gradient integration is performed numerically at the layer level using a Riemann-Stieltjes sum approximation. Compared to Grad-CAM, heatmaps produced by our algorithm are better focused in the areas of interest, and their numerical computation is more stable. Our code is available at https://github.com/mlerma54/RSIGradCAM

Submitted 22 May, 2022; originally announced May 2022.

Comments: 16 pages, 33 figures

MSC Class: 68T45 ACM Class: I.2.m; I.4.m

arXiv:2005.12848 [pdf, other]

Group-In: Group Inference from Wireless Traces of Mobile Devices

Authors: Gürkan Solmaz, Jonathan Fürst, Samet Aytaç, Fang-Jing Wu

Abstract: This paper proposes Group-In, a wireless scanning system to detect static or mobile people groups in indoor or outdoor environments. Group-In collects only wireless traces from the Bluetooth-enabled mobile devices for group inference. The key problem addressed in this work is to detect not only static groups but also moving groups with a multi-phased approach based only on noisy wireless Received Signal Strength Indicators (RSSIs) observed by multiple wireless scanners without localization support. We propose new centralized and decentralized schemes to process the sparse and noisy wireless data, and leverage graph-based clustering techniques for group detection from short-term and long-term aspects. Group-In provides two outcomes: 1) group detection in short time intervals such as two minutes and 2) long-term linkages such as a month. To verify the performance, we conduct two experimental studies. One consists of 27 controlled scenarios in the lab environments. The other is a real-world scenario where we place Bluetooth scanners in an office environment, and employees carry beacons for more than one month. Both the controlled and real-world experiments result in high accuracy group detection in short time intervals and sampling liberties in terms of the Jaccard index and pairwise similarity coefficient.

Submitted 10 June, 2020; v1 submitted 26 May, 2020; originally announced May 2020.

Comments: This work has been funded by the EU Horizon 2020 Programme under Grant Agreements No. 731993 AUTOPILOT and No. 871249 LOCUS projects. The content of this paper does not reflect the official opinion of the EU. Responsibility for the information and views expressed therein lies entirely with the authors. Proc. of ACM/IEEE IPSN'20, 2020

arXiv:1907.08278 [pdf, other]

Fog Function: Serverless Fog Computing for Data Intensive IoT Services

Authors: Bin Cheng, Jonathan Fürst, Gurkan Solmaz, Takuya Sanada

Abstract: Fog computing can support IoT services with fast response time and low bandwidth usage by moving computation from the cloud to edge devices. However, existing fog computing frameworks have limited flexibility to support dynamic service composition with a data-oriented approach. Function-as-a-Service (FaaS) is a promising programming model for fog computing to enhance flexibility, but the current event- or topic-based design of function triggering and the separation of data management and function execution result in inefficiency for data-intensive IoT services. To achieve both flexibility and efficiency, we propose a data-centric programming model called Fog Function and also introduce its underlying orchestration mechanism that leverages three types of contexts: data context, system context, and usage context. Moreover, we showcase a concrete use case for smart parking where Fog Function allows service developers to easily model their service logic with reduced learning efforts compared to a static service topology. Our performance evaluation results show that the Fog Function can be scaled to hundreds of fog nodes. Fog Function can improve system efficiency by saving 95% of the internal data traffic over cloud function and it can reduce service latency by 30% over edge function.

Submitted 18 July, 2019; originally announced July 2019.

arXiv:1904.12676 [pdf, other]

Reinforcement Learning Based Orchestration for Elastic Services

Authors: M. Fadel Argerich, B. Cheng, J. Fürst

Abstract: Due to the highly variable execution context in which edge services run, adapting their behavior to the execution context is crucial to comply with their requirements. However, adapting service behavior is a challenging task because it is hard to anticipate the execution contexts in which it will be deployed, as well as assessing the impact that each behavior change will produce. In order to provide this adaptation efficiently, we propose a Reinforcement Learning (RL) based Orchestration for Elastic Services. We implement and evaluate this approach by adapting an elastic service in different simulated execution contexts and comparing its performance to a Heuristics based approach. We show that elastic services achieve high precision and requirement satisfaction rates while creating an overhead of less than 0.5% to the overall service. In particular, the RL approach proves to be more efficient than its rule-based counterpart; yielding a 10 to 25% higher precision while being 25% less computationally expensive.

Submitted 26 April, 2019; originally announced April 2019.

Comments: 2019 IEEE 5th World Forum on Internet of Things (WF-IoT), 6 pages

arXiv:1807.02608 [pdf]

Synthetic Sampling for Multi-Class Malignancy Prediction

Authors: Matthew Yung, Eli T. Brown, Alexander Rasin, Jacob D. Furst, Daniela S. Raicu

Abstract: We explore several oversampling techniques for an imbalanced multi-label classification problem, a setting often encountered when developing models for Computer-Aided Diagnosis (CADx) systems. While most CADx systems aim to optimize classifiers for overall accuracy without considering the relative distribution of each class, we look into using synthetic sampling to increase per-class performance when predicting the degree of malignancy. Using low-level image features and a random forest classifier, we show that using synthetic oversampling techniques increases the sensitivity of the minority classes by an average of 7.22% points, with as much as a 19.88% point increase in sensitivity for a particular minority class. Furthermore, the analysis of low-level image feature distributions for the synthetic nodules reveals that these nodules can provide insights on how to preprocess image data for better classification performance or how to supplement the original datasets when more data acquisition is feasible.

Submitted 6 July, 2018; originally announced July 2018.

Comments: 5 pages, 3 figures, 4 Tables, KDD MLMH'18 Workshop

Showing 1–19 of 19 results for author: Fuerst, J