-
STITCH-OPE: Trajectory Stitching with Guided Diffusion for Off-Policy Evaluation
Authors:
Hossein Goli,
Michael Gimelfarb,
Nathan Samuel de Lara,
Haruki Nishimura,
Masha Itkina,
Florian Shkurti
Abstract:
Off-policy evaluation (OPE) estimates the performance of a target policy using offline data collected from a behavior policy, and is crucial in domains such as robotics or healthcare where direct interaction with the environment is costly or unsafe. Existing OPE methods are ineffective for high-dimensional, long-horizon problems, due to exponential blow-ups in variance from importance weighting or…
▽ More
Off-policy evaluation (OPE) estimates the performance of a target policy using offline data collected from a behavior policy, and is crucial in domains such as robotics or healthcare where direct interaction with the environment is costly or unsafe. Existing OPE methods are ineffective for high-dimensional, long-horizon problems, due to exponential blow-ups in variance from importance weighting or compounding errors from learned dynamics models. To address these challenges, we propose STITCH-OPE, a model-based generative framework that leverages denoising diffusion for long-horizon OPE in high-dimensional state and action spaces. Starting with a diffusion model pre-trained on the behavior data, STITCH-OPE generates synthetic trajectories from the target policy by guiding the denoising process using the score function of the target policy. STITCH-OPE proposes two technical innovations that make it advantageous for OPE: (1) prevents over-regularization by subtracting the score of the behavior policy during guidance, and (2) generates long-horizon trajectories by stitching partial trajectories together end-to-end. We provide a theoretical guarantee that under mild assumptions, these modifications result in an exponential reduction in variance versus long-horizon trajectory diffusion. Experiments on the D4RL and OpenAI Gym benchmarks show substantial improvement in mean squared error, correlation, and regret metrics compared to state-of-the-art OPE methods.
△ Less
Submitted 27 May, 2025;
originally announced May 2025.
-
Taxonomic Trace Links: Rethinking Traceability and its Benefits
Authors:
Waleed Abdeen,
Michael Unterkalmsteiner,
Alexandros Chirtoglou,
Christoph Paul Schimanski,
Heja Goli,
Krzysztof Wnuk
Abstract:
Traceability greatly supports knowledge-intensive tasks, e.g., coverage check and impact analysis. Despite its clear benefits, the \emph{practical} implementation of traceability poses significant challenges, leading to a reduced focus on the creation and maintenance of trace links. We propose a new approach -- Taxonomic Trace Links (TTL) -- which rethinks traceability and its benefits. With TTL,…
▽ More
Traceability greatly supports knowledge-intensive tasks, e.g., coverage check and impact analysis. Despite its clear benefits, the \emph{practical} implementation of traceability poses significant challenges, leading to a reduced focus on the creation and maintenance of trace links. We propose a new approach -- Taxonomic Trace Links (TTL) -- which rethinks traceability and its benefits. With TTL, trace links are created indirectly through a domain-specific taxonomy, a simplified version of a domain model. TTL has the potential to address key traceability challenges, such as the granularity of trace links, the lack of a common data structure among software development artifacts, and unclear responsibility for traceability. We explain how TTL addresses these challenges and perform an initial validation with practitioners. We identified six challenges associated with TTL implementation that need to be addressed. Finally, we propose a research roadmap to further develop and evaluate the technical solution of TTL. TTL appears to be particularly feasible in practice where a domain taxonomy is already established
△ Less
Submitted 29 April, 2025;
originally announced April 2025.
-
Detecting Viral Social Events through Censored Observation with Deep Survival Analysis
Authors:
Maryam Ramezani,
Hossein Goli,
AmirMohammad Izad,
Hamid R. Rabiee
Abstract:
Users increasing activity across various social networks made it the most widely used platform for exchanging and propagating information among individuals. To spread information within a network, a user initially shared information on a social network, and then other users in direct contact with him might have shared that information. Information expanded throughout the network by repeatedly foll…
▽ More
Users increasing activity across various social networks made it the most widely used platform for exchanging and propagating information among individuals. To spread information within a network, a user initially shared information on a social network, and then other users in direct contact with him might have shared that information. Information expanded throughout the network by repeatedly following this process. A set of information that became popular and was repeatedly shared by different individuals was called viral events. Identifying and analyzing viral social events led to valuable insights into the dynamics of information dissemination within a network. However, more importantly, proactive approaches emerged. In other words, by observing the dissemination pattern of a piece of information in the early stages of expansion, it became possible to determine whether this cascade would become viral in the future. This research aimed to predict and detect viral events in social networks by observing granular information and using a deep survival analysis-based method. This model could play a significant role in identifying rumors, predicting the impact of information, and assisting in optimal decision-making in information management and marketing. Ultimately, the proposed method was tested on various real-world datasets from Twitter, Weibo, and Digg.
△ Less
Submitted 2 October, 2024;
originally announced October 2024.
-
Certified Adversarial Robustness via Partition-based Randomized Smoothing
Authors:
Hossein Goli,
Farzan Farnia
Abstract:
A reliable application of deep neural network classifiers requires robustness certificates against adversarial perturbations. Gaussian smoothing is a widely analyzed approach to certifying robustness against norm-bounded perturbations, where the certified prediction radius depends on the variance of the Gaussian noise and the confidence level of the neural net's prediction under the additive Gauss…
▽ More
A reliable application of deep neural network classifiers requires robustness certificates against adversarial perturbations. Gaussian smoothing is a widely analyzed approach to certifying robustness against norm-bounded perturbations, where the certified prediction radius depends on the variance of the Gaussian noise and the confidence level of the neural net's prediction under the additive Gaussian noise. However, in application to high-dimensional image datasets, the certified radius of the plain Gaussian smoothing could be relatively small, since Gaussian noise with high variances can significantly harm the visibility of an image. In this work, we propose the Pixel Partitioning-based Randomized Smoothing (PPRS) methodology to boost the neural net's confidence score and thus the robustness radius of the certified prediction. We demonstrate that the proposed PPRS algorithm improves the visibility of the images under additive Gaussian noise. We discuss the numerical results of applying PPRS to standard computer vision datasets and neural network architectures. Our empirical findings indicate a considerable improvement in the certified accuracy and stability of the prediction model to the additive Gaussian noise in randomized smoothing.
△ Less
Submitted 20 September, 2024;
originally announced September 2024.
-
Multi-Label Requirements Classification with Large Taxonomies
Authors:
Waleed Abdeen,
Michael Unterkalmsteiner,
Krzysztof Wnuk,
Alexandros Chirtoglou,
Christoph Schimanski,
Heja Goli
Abstract:
Classification aids software development activities by organizing requirements in classes for easier access and retrieval. The majority of requirements classification research has, so far, focused on binary or multi-class classification. Multi-label classification with large taxonomies could aid requirements traceability but is prohibitively costly with supervised training. Hence, we investigate z…
▽ More
Classification aids software development activities by organizing requirements in classes for easier access and retrieval. The majority of requirements classification research has, so far, focused on binary or multi-class classification. Multi-label classification with large taxonomies could aid requirements traceability but is prohibitively costly with supervised training. Hence, we investigate zero-short learning to evaluate the feasibility of multi-label requirements classification with large taxonomies. We associated, together with domain experts from the industry, 129 requirements with 769 labels from taxonomies ranging between 250 and 1183 classes. Then, we conducted a controlled experiment to study the impact of the type of classifier, the hierarchy, and the structural characteristics of taxonomies on the classification performance. The results show that: (1) The sentence-based classifier had a significantly higher recall compared to the word-based classifier; however, the precision and F1-score did not improve significantly. (2) The hierarchical classification strategy did not always improve the performance of requirements classification. (3) The total and leaf nodes of the taxonomies have a strong negative correlation with the recall of the hierarchical sentence-based classifier. We investigate the problem of multi-label requirements classification with large taxonomies, illustrate a systematic process to create a ground truth involving industry participants, and provide an analysis of different classification pipelines using zero-shot learning.
△ Less
Submitted 23 April, 2025; v1 submitted 7 June, 2024;
originally announced June 2024.
-
3D Deep Learning on Medical Images: A Review
Authors:
Satya P. Singh,
Lipo Wang,
Sukrit Gupta,
Haveesh Goli,
Parasuraman Padmanabhan,
Balázs Gulyás
Abstract:
The rapid advancements in machine learning, graphics processing technologies and the availability of medical imaging data have led to a rapid increase in the use of deep learning models in the medical domain. This was exacerbated by the rapid advancements in convolutional neural network (CNN) based architectures, which were adopted by the medical imaging community to assist clinicians in disease d…
▽ More
The rapid advancements in machine learning, graphics processing technologies and the availability of medical imaging data have led to a rapid increase in the use of deep learning models in the medical domain. This was exacerbated by the rapid advancements in convolutional neural network (CNN) based architectures, which were adopted by the medical imaging community to assist clinicians in disease diagnosis. Since the grand success of AlexNet in 2012, CNNs have been increasingly used in medical image analysis to improve the efficiency of human clinicians. In recent years, three-dimensional (3D) CNNs have been employed for the analysis of medical images. In this paper, we trace the history of how the 3D CNN was developed from its machine learning roots, we provide a brief mathematical description of 3D CNN and provide the preprocessing steps required for medical images before feeding them to 3D CNNs. We review the significant research in the field of 3D medical imaging analysis using 3D CNNs (and its variants) in different medical areas such as classification, segmentation, detection and localization. We conclude by discussing the challenges associated with the use of 3D CNNs in the medical imaging domain (and the use of deep learning models in general) and possible future trends in the field.
△ Less
Submitted 13 October, 2020; v1 submitted 31 March, 2020;
originally announced April 2020.