-
A Transfer Framework for Enhancing Temporal Graph Learning in Data-Scarce Settings
Authors:
Sidharth Agarwal,
Tanishq Dubey,
Shubham Gupta,
Srikanta Bedathur
Abstract:
Dynamic interactions between entities are prevalent in domains like social platforms, financial systems, healthcare, and e-commerce. These interactions can be effectively represented as time-evolving graphs, where predicting future connections is a key task in applications such as recommendation systems. Temporal Graph Neural Networks (TGNNs) have achieved strong results for such predictive tasks but typically require extensive training data, which is often limited in real-world scenarios. One approach to mitigating data scarcity is leveraging pre-trained models from related datasets. However, direct knowledge transfer between TGNNs is challenging due to their reliance on node-specific memory structures, making them inherently difficult to adapt across datasets.
To address this, we introduce a novel transfer approach that disentangles node representations from their associated features through a structured bipartite encoding mechanism. This decoupling enables more effective transfer of memory components and other learned inductive patterns from one dataset to another. Empirical evaluations on real-world benchmarks demonstrate that our method significantly enhances TGNN performance in low-data regimes, outperforming non-transfer baselines by up to 56% and surpassing existing transfer strategies by 36%.
Submitted 11 March, 2025; v1 submitted 2 March, 2025;
originally announced March 2025.
-
LegalSeg: Unlocking the Structure of Indian Legal Judgments Through Rhetorical Role Classification
Authors:
Shubham Kumar Nigam,
Tanmay Dubey,
Govind Sharma,
Noel Shallum,
Kripabandhu Ghosh,
Arnab Bhattacharya
Abstract:
In this paper, we address the task of semantic segmentation of legal documents through rhetorical role classification, with a focus on Indian legal judgments. We introduce LegalSeg, the largest annotated dataset for this task, comprising over 7,000 documents and 1.4 million sentences, labeled with 7 rhetorical roles. To benchmark performance, we evaluate multiple state-of-the-art models, including Hierarchical BiLSTM-CRF, TransformerOverInLegalBERT (ToInLegalBERT), Graph Neural Networks (GNNs), and Role-Aware Transformers, alongside an exploratory RhetoricLLaMA, an instruction-tuned large language model. Our results demonstrate that models incorporating broader context, structural relationships, and sequential sentence information outperform those relying solely on sentence-level features. Additionally, we conduct experiments using surrounding context and the predicted or actual labels of neighboring sentences to assess their impact on classification accuracy. Despite these advancements, challenges persist in distinguishing between closely related roles and addressing class imbalance. Our work underscores the potential of advanced techniques for improving legal document understanding and sets a strong foundation for future research in legal NLP.
Submitted 9 February, 2025;
originally announced February 2025.
-
Towards Optimizing the Costs of LLM Usage
Authors:
Shivanshu Shekhar,
Tanishq Dubey,
Koyel Mukherjee,
Apoorv Saxena,
Atharv Tyagi,
Nishanth Kotla
Abstract:
Generative AI and LLMs in particular are heavily used nowadays for various document processing tasks such as question answering and summarization. However, different LLMs come with different capabilities for different tasks as well as with different costs, tokenization, and latency. In fact, enterprises are already incurring huge costs of operating or using LLMs for their respective use cases.
In this work, we propose optimizing the usage costs of LLMs by estimating their output quality (without actually invoking the LLMs), and then solving an optimization routine for the LLM selection to either keep costs under a budget or minimize the costs, in a quality- and latency-aware manner. We propose a model to predict the output quality of LLMs on document processing tasks like summarization, followed by an LP rounding algorithm to optimize the selection of LLMs. We study optimization problems trading off quality and cost, both theoretically and empirically. We further propose a sentence simplification model for reducing the number of tokens in a controlled manner. Additionally, we propose several deterministic heuristics for reducing tokens in a quality-aware manner, and study the related optimization problem of applying the heuristics to optimize the quality and cost trade-off. We perform extensive empirical validation of our methods not only on enterprise datasets but also on open-source datasets, annotated by us, and show that we perform much better than the closest baselines. Our methods reduce costs by 40%-90% while improving quality by 4%-7%. We will release the annotated open-source datasets to the community for further research and exploration.
Submitted 29 January, 2024;
originally announced February 2024.
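The budget-constrained selection described in the abstract above can be illustrated with a small sketch. This is not the authors' LP rounding algorithm: it swaps in a simple greedy heuristic that starts from the cheapest model for every document and repeatedly applies the upgrade with the best predicted quality gain per unit of extra cost. The function name `select_models`, the input layout, and the quality and cost numbers are all hypothetical, and the predicted qualities are assumed to come from some separate estimator.

```python
from typing import Dict, List, Tuple

# Hypothetical input layout: for each document, model name -> (predicted quality, cost).
Options = Dict[str, Tuple[float, float]]


def select_models(docs: List[Options], budget: float) -> List[str]:
    """Pick one model per document with a greedy upgrade heuristic (not LP rounding)."""
    # Start from the cheapest model per document; break cost ties by higher quality.
    choice = [min(opts, key=lambda m: (opts[m][1], -opts[m][0])) for opts in docs]
    spent = sum(docs[i][m][1] for i, m in enumerate(choice))
    if spent > budget:
        raise ValueError("budget is below the cost of the cheapest assignment")

    while True:
        best = None  # (gain per unit of extra cost, document index, model, extra cost)
        for i, opts in enumerate(docs):
            cur_q, cur_c = opts[choice[i]]
            for m, (q, c) in opts.items():
                gain, extra = q - cur_q, c - cur_c
                # Skip options that do not improve quality or do not fit the budget.
                if gain <= 0 or extra <= 0 or spent + extra > budget:
                    continue
                ratio = gain / extra
                if best is None or ratio > best[0]:
                    best = (ratio, i, m, extra)
        if best is None:
            return choice  # no affordable upgrade remains
        _, i, m, extra = best
        choice[i] = m
        spent += extra


# Made-up numbers: the large model helps document 0 far more than document 1.
docs = [
    {"small": (0.6, 1.0), "large": (0.9, 5.0)},
    {"small": (0.7, 1.0), "large": (0.75, 5.0)},
]
selection = select_models(docs, budget=7.0)
```

With these made-up numbers the budget covers only one upgrade, and the heuristic spends it on the document where the larger model is predicted to help most. A greedy pass like this carries no optimality guarantee, which is one reason an LP-based formulation such as the one named in the abstract may be preferable.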
-
Dynamically Improving Branch Prediction Accuracy Between Contexts
Authors:
Adam Auten,
Tanishq Dubey,
Rohan Mathur
Abstract:
Branch prediction is a standard feature in most processors, significantly improving the run time of programs by allowing a processor to predict the direction of a branch before it has been evaluated. Current branch prediction methods can achieve excellent prediction accuracy through global tables, various hashing methods, and even machine learning techniques such as SVMs or neural networks. Such designs, however, may lose effectiveness when attempting to predict across context switches in the operating system. Such a scenario may lead to destructive interference between contexts, thereby reducing overall predictor accuracy. To solve this problem, we propose a novel scheme for deciding whether a context switch produces destructive or constructive interference. First, we present evidence that destructive interference can have a significant negative impact on prediction accuracy. Second, we present an extensible framework that keeps track of context switches and prediction accuracy to improve overall accuracy. Experimental results show that this framework effectively reduces the effect of destructive interference on branch prediction.
Submitted 1 May, 2018;
originally announced May 2018.
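The destructive interference mentioned in the abstract above can be reproduced with a toy model. The sketch below is a textbook 2-bit saturating-counter predictor, not the framework proposed in the paper. The class `TwoBitPredictor`, the function `accuracy`, the table size, and the synthetic trace are all assumptions made for illustration. Two contexts alternate on the processor and hit the same table entry with opposite branch outcomes, and accuracy is compared between one shared table and one table per context.

```python
class TwoBitPredictor:
    """Table of 2-bit saturating counters indexed by the low bits of the branch address."""

    def __init__(self, size: int = 16) -> None:
        self.size = size
        self.table = [1] * size  # counters 0-1 predict not taken, 2-3 predict taken

    def predict(self, pc: int) -> bool:
        return self.table[pc % self.size] >= 2

    def update(self, pc: int, taken: bool) -> None:
        i = pc % self.size
        if taken:
            self.table[i] = min(3, self.table[i] + 1)
        else:
            self.table[i] = max(0, self.table[i] - 1)


def accuracy(trace, shared: bool) -> float:
    """trace holds (context id, pc, taken) tuples; shared=True uses one table for all contexts."""
    predictors = {}
    correct = 0
    for ctx, pc, taken in trace:
        key = 0 if shared else ctx
        predictor = predictors.setdefault(key, TwoBitPredictor())
        correct += predictor.predict(pc) == taken
        predictor.update(pc, taken)
    return correct / len(trace)


# Synthetic trace: a context switch every two branches, with both contexts
# mapping to the same table entry but resolving in opposite directions.
trace = []
for _ in range(50):
    trace += [("A", 0x40, True)] * 2 + [("B", 0x40, False)] * 2

shared_acc = accuracy(trace, shared=True)
private_acc = accuracy(trace, shared=False)
```

On this synthetic trace the shared table keeps being retrained by the other context and is right on only a quarter of the branches, while per-context tables are wrong only once, during warm-up. Real workloads are far less adversarial, so the gap here shows only the direction of the effect, not its typical size.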