-
A Unified Framework for Provably Efficient Algorithms to Estimate Shapley Values
Authors:
Tyler Chen,
Akshay Seshadri,
Mattia J. Villani,
Pradeep Niroula,
Shouvanik Chakrabarti,
Archan Ray,
Pranav Deshpande,
Romina Yalovetzky,
Marco Pistoia,
Niraj Kumar
Abstract:
Shapley values have emerged as a critical tool for explaining which features impact the decisions made by machine learning models. However, computing exact Shapley values is difficult, generally requiring an exponential (in the feature dimension) number of model evaluations. To address this, many model-agnostic randomized estimators have been developed, the most influential and widely used being t…
▽ More
Shapley values have emerged as a critical tool for explaining which features impact the decisions made by machine learning models. However, computing exact Shapley values is difficult, generally requiring an exponential (in the feature dimension) number of model evaluations. To address this, many model-agnostic randomized estimators have been developed, the most influential and widely used being the KernelSHAP method (Lundberg & Lee, 2017). While related estimators such as unbiased KernelSHAP (Covert & Lee, 2021) and LeverageSHAP (Musco & Witter, 2025) are known to satisfy theoretical guarantees, bounds for KernelSHAP have remained elusive. We describe a broad and unified framework that encompasses KernelSHAP and related estimators constructed using both with and without replacement sampling strategies. We then prove strong non-asymptotic theoretical guarantees that apply to all estimators from our framework. This provides, to the best of our knowledge, the first theoretical guarantees for KernelSHAP and sheds further light on tradeoffs between existing estimators. Through comprehensive benchmarking on small and medium dimensional datasets for Decision-Tree models, we validate our approach against exact Shapley values, consistently achieving low mean squared error with modest sample sizes. Furthermore, we make specific implementation improvements to enable scalability of our methods to high-dimensional datasets. Our methods, tested on datasets such MNIST and CIFAR10, provide consistently better results compared to the KernelSHAP library.
△ Less
Submitted 5 June, 2025;
originally announced June 2025.
-
Provably faster randomized and quantum algorithms for $k$-means clustering via uniform sampling
Authors:
Tyler Chen,
Archan Ray,
Akshay Seshadri,
Dylan Herman,
Bao Bach,
Pranav Deshpande,
Abhishek Som,
Niraj Kumar,
Marco Pistoia
Abstract:
The $k$-means algorithm (Lloyd's algorithm) is a widely used method for clustering unlabeled data. A key bottleneck of the $k$-means algorithm is that each iteration requires time linear in the number of data points, which can be expensive in big data applications. This was improved in recent works proposing quantum and quantum-inspired classical algorithms to approximate the $k$-means algorithm l…
▽ More
The $k$-means algorithm (Lloyd's algorithm) is a widely used method for clustering unlabeled data. A key bottleneck of the $k$-means algorithm is that each iteration requires time linear in the number of data points, which can be expensive in big data applications. This was improved in recent works proposing quantum and quantum-inspired classical algorithms to approximate the $k$-means algorithm locally, in time depending only logarithmically on the number of data points (along with data dependent parameters) [$q$-means: A quantum algorithm for unsupervised machine learning; Kerenidis, Landman, Luongo, and Prakash, NeurIPS 2019; Do you know what $q$-means?, Doriguello, Luongo, Tang]. In this work, we describe a simple randomized mini-batch $k$-means algorithm and a quantum algorithm inspired by the classical algorithm. We prove worse-case guarantees that significantly improve upon the bounds for previous algorithms. Our improvements are due to a careful use of uniform sampling, which preserves certain symmetries of the $k$-means problem that are not preserved in previous algorithms that use data norm-based sampling.
△ Less
Submitted 1 May, 2025; v1 submitted 29 April, 2025;
originally announced April 2025.
-
One Jump Is All You Need: Short-Cutting Transformers for Early Exit Prediction with One Jump to Fit All Exit Levels
Authors:
Amrit Diggavi Seshadri
Abstract:
To reduce the time and computational costs of inference of large language models, there has been interest in parameter-efficient low-rank early-exit casting of transformer hidden-representations to final-representations. Such low-rank short-cutting has been shown to outperform identity shortcuts at early model stages while offering parameter-efficiency in shortcut jumps. However, current low-rank…
▽ More
To reduce the time and computational costs of inference of large language models, there has been interest in parameter-efficient low-rank early-exit casting of transformer hidden-representations to final-representations. Such low-rank short-cutting has been shown to outperform identity shortcuts at early model stages while offering parameter-efficiency in shortcut jumps. However, current low-rank methods maintain a separate early-exit shortcut jump to final-representations for each transformer intermediate block-level during inference. In this work, we propose selection of a single One-Jump-Fits-All (OJFA) low-rank shortcut that offers over a 30x reduction in shortcut parameter costs during inference. We show that despite this extreme reduction, our OJFA choice largely matches the performance of maintaining multiple shortcut jumps during inference and offers stable precision from all transformer block-levels for GPT2-XL, Phi3-Mini and Llama2-7B transformer models.
△ Less
Submitted 18 April, 2025;
originally announced April 2025.
-
Adaptive and Robust Watermark for Generative Tabular Data
Authors:
Dung Daniel Ngo,
Daniel Scott,
Saheed Obitayo,
Archan Ray,
Akshay Seshadri,
Niraj Kumar,
Vamsi K. Potluru,
Marco Pistoia,
Manuela Veloso
Abstract:
Recent development in generative models has demonstrated its ability to create high-quality synthetic data. However, the pervasiveness of synthetic content online also brings forth growing concerns that it can be used for malicious purpose. To ensure the authenticity of the data, watermarking techniques have recently emerged as a promising solution due to their strong statistical guarantees. In th…
▽ More
Recent development in generative models has demonstrated its ability to create high-quality synthetic data. However, the pervasiveness of synthetic content online also brings forth growing concerns that it can be used for malicious purpose. To ensure the authenticity of the data, watermarking techniques have recently emerged as a promising solution due to their strong statistical guarantees. In this paper, we propose a flexible and robust watermarking mechanism for generative tabular data. Specifically, a data provider with knowledge of the downstream tasks can partition the feature space into pairs of (key, value) columns. Within each pair, the data provider first uses elements in the key column to generate a randomized set of ``green'' intervals, then encourages elements of the value column to be in one of these ``green'' intervals. We show theoretically and empirically that the watermarked datasets (i) have negligible impact on the data quality and downstream utility, (ii) can be efficiently detected, (iii) are robust against multiple attacks commonly observed in data science, and (iv) maintain strong security against adversary attempting to learn the underlying watermark scheme.
△ Less
Submitted 6 June, 2025; v1 submitted 23 September, 2024;
originally announced September 2024.
-
Normalized Narrow Jump To Conclusions: Normalized Narrow Shortcuts for Parameter Efficient Early Exit Transformer Prediction
Authors:
Amrit Diggavi Seshadri
Abstract:
With the size and cost of large transformer-based language models growing, recently, there has been interest in shortcut casting of early transformer hidden-representations to final-representations for cheaper model inference. In particular, shortcutting pre-trained transformers with linear transformations over early layers has been shown to improve precision in early inference. However, for large…
▽ More
With the size and cost of large transformer-based language models growing, recently, there has been interest in shortcut casting of early transformer hidden-representations to final-representations for cheaper model inference. In particular, shortcutting pre-trained transformers with linear transformations over early layers has been shown to improve precision in early inference. However, for large language models, even this becomes computationally expensive. In this work, we propose Narrow Jump to Conclusions (NJTC) and Normalized Narrow Jump to Conclusions (N-NJTC) - parameter efficient alternatives to standard linear shortcutting that reduces shortcut parameter count by over 97%. We show that N-NJTC reliably outperforms Identity shortcuts at early stages and offers stable precision from all transformer block levels for GPT-2-XL, Phi3-Mini and Llama2-7B transformer models, demonstrating the viability of more parameter efficient short-cutting approaches.
△ Less
Submitted 3 October, 2024; v1 submitted 21 September, 2024;
originally announced September 2024.
-
B'MOJO: Hybrid State Space Realizations of Foundation Models with Eidetic and Fading Memory
Authors:
Luca Zancato,
Arjun Seshadri,
Yonatan Dukler,
Aditya Golatkar,
Yantao Shen,
Benjamin Bowman,
Matthew Trager,
Alessandro Achille,
Stefano Soatto
Abstract:
We describe a family of architectures to support transductive inference by allowing memory to grow to a finite but a-priori unknown bound while making efficient use of finite resources for inference. Current architectures use such resources to represent data either eidetically over a finite span ("context" in Transformers), or fading over an infinite span (in State Space Models, or SSMs). Recent h…
▽ More
We describe a family of architectures to support transductive inference by allowing memory to grow to a finite but a-priori unknown bound while making efficient use of finite resources for inference. Current architectures use such resources to represent data either eidetically over a finite span ("context" in Transformers), or fading over an infinite span (in State Space Models, or SSMs). Recent hybrid architectures have combined eidetic and fading memory, but with limitations that do not allow the designer or the learning process to seamlessly modulate the two, nor to extend the eidetic memory span. We leverage ideas from Stochastic Realization Theory to develop a class of models called B'MOJO to seamlessly combine eidetic and fading memory within an elementary composable module. The overall architecture can be used to implement models that can access short-term eidetic memory "in-context," permanent structural memory "in-weights," fading memory "in-state," and long-term eidetic memory "in-storage" by natively incorporating retrieval from an asynchronously updated memory. We show that Transformers, existing SSMs such as Mamba, and hybrid architectures such as Jamba are special cases of B'MOJO and describe a basic implementation, to be open sourced, that can be stacked and scaled efficiently in hardware. We test B'MOJO on transductive inference tasks, such as associative recall, where it outperforms existing SSMs and Hybrid models; as a baseline, we test ordinary language modeling where B'MOJO achieves perplexity comparable to similarly-sized Transformers and SSMs up to 1.4B parameters, while being up to 10% faster to train. Finally, we show that B'MOJO's ability to modulate eidetic and fading memory results in better inference on longer sequences tested up to 32K tokens, four-fold the length of the longest sequences seen during training.
△ Less
Submitted 8 July, 2024;
originally announced July 2024.
-
Diffusion Soup: Model Merging for Text-to-Image Diffusion Models
Authors:
Benjamin Biggs,
Arjun Seshadri,
Yang Zou,
Achin Jain,
Aditya Golatkar,
Yusheng Xie,
Alessandro Achille,
Ashwin Swaminathan,
Stefano Soatto
Abstract:
We present Diffusion Soup, a compartmentalization method for Text-to-Image Generation that averages the weights of diffusion models trained on sharded data. By construction, our approach enables training-free continual learning and unlearning with no additional memory or inference costs, since models corresponding to data shards can be added or removed by re-averaging. We show that Diffusion Soup…
▽ More
We present Diffusion Soup, a compartmentalization method for Text-to-Image Generation that averages the weights of diffusion models trained on sharded data. By construction, our approach enables training-free continual learning and unlearning with no additional memory or inference costs, since models corresponding to data shards can be added or removed by re-averaging. We show that Diffusion Soup samples from a point in weight space that approximates the geometric mean of the distributions of constituent datasets, which offers anti-memorization guarantees and enables zero-shot style mixing. Empirically, Diffusion Soup outperforms a paragon model trained on the union of all data shards and achieves a 30% improvement in Image Reward (.34 $\to$ .44) on domain sharded data, and a 59% improvement in IR (.37 $\to$ .59) on aesthetic data. In both cases, souping also prevails in TIFA score (respectively, 85.5 $\to$ 86.5 and 85.6 $\to$ 86.8). We demonstrate robust unlearning -- removing any individual domain shard only lowers performance by 1% in IR (.45 $\to$ .44) -- and validate our theoretical insights on anti-memorization using real data. Finally, we showcase Diffusion Soup's ability to blend the distinct styles of models finetuned on different shards, resulting in the zero-shot generation of hybrid styles.
△ Less
Submitted 12 June, 2024;
originally announced June 2024.
-
Bypassing Skip-Gram Negative Sampling: Dimension Regularization as a More Efficient Alternative for Graph Embeddings
Authors:
David Liu,
Arjun Seshadri,
Tina Eliassi-Rad,
Johan Ugander
Abstract:
A wide range of graph embedding objectives decompose into two components: one that enforces similarity, attracting the embeddings of nodes that are perceived as similar, and another that enforces dissimilarity, repelling the embeddings of nodes that are perceived as dissimilar. Without repulsion, the embeddings would collapse into trivial solutions. Skip-Gram Negative Sampling (SGNS) is a popular…
▽ More
A wide range of graph embedding objectives decompose into two components: one that enforces similarity, attracting the embeddings of nodes that are perceived as similar, and another that enforces dissimilarity, repelling the embeddings of nodes that are perceived as dissimilar. Without repulsion, the embeddings would collapse into trivial solutions. Skip-Gram Negative Sampling (SGNS) is a popular and efficient repulsion approach that prevents collapse by repelling each node from a sample of dissimilar nodes. In this work, we show that when repulsion is most needed and the embeddings approach collapse, SGNS node-wise repulsion is, in the aggregate, an approximate re-centering of the node embedding dimensions. Such dimension operations are more scalable than node operations and produce a simpler geometric interpretation of the repulsion. Our theoretical result establishes dimension regularization as an effective and more efficient, compared to skip-gram node contrast, approach to enforcing dissimilarity among embeddings of nodes. We use this result to propose a flexible algorithm augmentation framework that improves the scalability of any existing algorithm using SGNS. The framework prioritizes node attraction and replaces SGNS with dimension regularization. We instantiate this generic framework for LINE and node2vec and show that the augmented algorithms preserve downstream link-prediction performance while reducing GPU memory usage by up to 33.3% and training time by 23.4%. Moreover, we show that completely removing repulsion (a special case of our augmentation framework) in LINE reduces training time by 70.9% on average, while increasing link prediction performance, especially for graphs that are globally sparse but locally dense. In general, however, repulsion is needed, and dimension regularization provides an efficient alternative to SGNS.
△ Less
Submitted 2 June, 2025; v1 submitted 30 April, 2024;
originally announced May 2024.
-
Learning Rich Rankings
Authors:
Arjun Seshadri,
Stephen Ragain,
Johan Ugander
Abstract:
Although the foundations of ranking are well established, the ranking literature has primarily been focused on simple, unimodal models, e.g. the Mallows and Plackett-Luce models, that define distributions centered around a single total ordering. Explicit mixture models have provided some tools for modelling multimodal ranking data, though learning such models from data is often difficult. In this…
▽ More
Although the foundations of ranking are well established, the ranking literature has primarily been focused on simple, unimodal models, e.g. the Mallows and Plackett-Luce models, that define distributions centered around a single total ordering. Explicit mixture models have provided some tools for modelling multimodal ranking data, though learning such models from data is often difficult. In this work, we contribute a contextual repeated selection (CRS) model that leverages recent advances in choice modeling to bring a natural multimodality and richness to the rankings space. We provide rigorous theoretical guarantees for maximum likelihood estimation under the model through structure-dependent tail risk and expected risk bounds. As a by-product, we also furnish the first tight bounds on the expected risk of maximum likelihood estimators for the multinomial logit (MNL) choice model and the Plackett-Luce (PL) ranking model, as well as the first tail risk bound on the PL ranking model. The CRS model significantly outperforms existing methods for modeling real world ranking data in a variety of settings, from racing to rank choice voting.
△ Less
Submitted 22 December, 2023;
originally announced December 2023.
-
Reasoning over the Behaviour of Objects in Video-Clips for Adverb-Type Recognition
Authors:
Amrit Diggavi Seshadri,
Alessandra Russo
Abstract:
In this work, following the intuition that adverbs describing scene-sequences are best identified by reasoning over high-level concepts of object-behavior, we propose the design of a new framework that reasons over object-behaviours extracted from raw-video-clips to recognize the clip's corresponding adverb-types. Importantly, while previous works for general scene adverb-recognition assume knowle…
▽ More
In this work, following the intuition that adverbs describing scene-sequences are best identified by reasoning over high-level concepts of object-behavior, we propose the design of a new framework that reasons over object-behaviours extracted from raw-video-clips to recognize the clip's corresponding adverb-types. Importantly, while previous works for general scene adverb-recognition assume knowledge of the clips underlying action-types, our method is directly applicable in the more general problem setting where the action-type of a video-clip is unknown. Specifically, we propose a novel pipeline that extracts human-interpretable object-behaviour-facts from raw video clips and propose novel symbolic and transformer based reasoning methods that operate over these extracted facts to identify adverb-types. Experiment results demonstrate that our proposed methods perform favourably against the previous state-of-the-art. Additionally, to support efforts in symbolic video-processing, we release two new datasets of object-behaviour-facts extracted from raw video clips - the MSR-VTT-ASP and ActivityNet-ASP datasets.
△ Less
Submitted 27 March, 2024; v1 submitted 9 July, 2023;
originally announced July 2023.
-
HandsOff: Labeled Dataset Generation With No Additional Human Annotations
Authors:
Austin Xu,
Mariya I. Vasileva,
Achal Dave,
Arjun Seshadri
Abstract:
Recent work leverages the expressive power of generative adversarial networks (GANs) to generate labeled synthetic datasets. These dataset generation methods often require new annotations of synthetic images, which forces practitioners to seek out annotators, curate a set of synthetic images, and ensure the quality of generated labels. We introduce the HandsOff framework, a technique capable of pr…
▽ More
Recent work leverages the expressive power of generative adversarial networks (GANs) to generate labeled synthetic datasets. These dataset generation methods often require new annotations of synthetic images, which forces practitioners to seek out annotators, curate a set of synthetic images, and ensure the quality of generated labels. We introduce the HandsOff framework, a technique capable of producing an unlimited number of synthetic images and corresponding labels after being trained on less than 50 pre-existing labeled images. Our framework avoids the practical drawbacks of prior work by unifying the field of GAN inversion with dataset generation. We generate datasets with rich pixel-wise labels in multiple challenging domains such as faces, cars, full-body human poses, and urban driving scenes. Our method achieves state-of-the-art performance in semantic segmentation, keypoint detection, and depth estimation compared to prior dataset generation approaches and transfer learning baselines. We additionally showcase its ability to address broad challenges in model development which stem from fixed, hand-annotated datasets, such as the long-tail problem in semantic segmentation. Project page: austinxu87.github.io/handsoff.
△ Less
Submitted 30 March, 2023; v1 submitted 23 December, 2022;
originally announced December 2022.
-
RecXplainer: Amortized Attribute-based Personalized Explanations for Recommender Systems
Authors:
Sahil Verma,
Chirag Shah,
John P. Dickerson,
Anurag Beniwal,
Narayanan Sadagopan,
Arjun Seshadri
Abstract:
Recommender systems influence many of our interactions in the digital world -- impacting how we shop for clothes, sorting what we see when browsing YouTube or TikTok, and determining which restaurants and hotels we are shown when using hospitality platforms. Modern recommender systems are large, opaque models trained on a mixture of proprietary and open-source datasets. Naturally, issues of trust…
▽ More
Recommender systems influence many of our interactions in the digital world -- impacting how we shop for clothes, sorting what we see when browsing YouTube or TikTok, and determining which restaurants and hotels we are shown when using hospitality platforms. Modern recommender systems are large, opaque models trained on a mixture of proprietary and open-source datasets. Naturally, issues of trust arise on both the developer and user side: is the system working correctly, and why did a user receive (or not receive) a particular recommendation? Providing an explanation alongside a recommendation alleviates some of these concerns. The status quo for auxiliary recommender system feedback is either user-specific explanations (e.g., "users who bought item B also bought item A") or item-specific explanations (e.g., "we are recommending item A because you watched/bought item B"). However, users bring personalized context into their search experience, valuing an item as a function of that item's attributes and their own personal preferences. In this work, we propose RecXplainer, a novel method for generating fine-grained explanations based on a user's preferences over the attributes of recommended items. We evaluate RecXplainer on five real-world and large-scale recommendation datasets using five different kinds of recommender systems to demonstrate the efficacy of RecXplainer in capturing users' preferences over item attributes and using them to explain recommendations. We also compare RecXplainer to five baselines and show RecXplainer's exceptional performance on ten metrics.
△ Less
Submitted 29 August, 2023; v1 submitted 27 November, 2022;
originally announced November 2022.
-
Contrastive Learning for Interactive Recommendation in Fashion
Authors:
Karin Sevegnani,
Arjun Seshadri,
Tian Wang,
Anurag Beniwal,
Julian McAuley,
Alan Lu,
Gerard Medioni
Abstract:
Recommender systems and search are both indispensable in facilitating personalization and ease of browsing in online fashion platforms. However, the two tools often operate independently, failing to combine the strengths of recommender systems to accurately capture user tastes with search systems' ability to process user queries. We propose a novel remedy to this problem by automatically recommend…
▽ More
Recommender systems and search are both indispensable in facilitating personalization and ease of browsing in online fashion platforms. However, the two tools often operate independently, failing to combine the strengths of recommender systems to accurately capture user tastes with search systems' ability to process user queries. We propose a novel remedy to this problem by automatically recommending personalized fashion items based on a user-provided text request. Our proposed model, WhisperLite, uses contrastive learning to capture user intent from natural language text and improves the recommendation quality of fashion products. WhisperLite combines the strength of CLIP embeddings with additional neural network layers for personalization, and is trained using a composite loss function based on binary cross entropy and contrastive loss. The model demonstrates a significant improvement in offline recommendation retrieval metrics when tested on a real-world dataset collected from an online retail fashion store, as well as widely used open-source datasets in different e-commerce domains, such as restaurants, movies and TV shows, clothing and shoe reviews. We additionally conduct a user study that captures user judgements on the relevance of the model's recommended items, confirming the relevancy of WhisperLite's recommendations in an online setting.
△ Less
Submitted 25 July, 2022;
originally announced July 2022.
-
On the separation of correlation-assisted sum capacities of multiple access channels
Authors:
Akshay Seshadri,
Felix Leditzky,
Vikesh Siddhu,
Graeme Smith
Abstract:
The capacity of a channel characterizes the maximum rate at which information can be transmitted through the channel asymptotically faithfully. For a channel with multiple senders and a single receiver, computing its sum capacity is possible in theory, but challenging in practice because of the nonconvex optimization involved. To address this challenge, we investigate three topics in our study. In…
▽ More
The capacity of a channel characterizes the maximum rate at which information can be transmitted through the channel asymptotically faithfully. For a channel with multiple senders and a single receiver, computing its sum capacity is possible in theory, but challenging in practice because of the nonconvex optimization involved. To address this challenge, we investigate three topics in our study. In the first part, we study the sum capacity of a family of multiple access channels (MACs) obtained from nonlocal games. For any MAC in this family, we obtain an upper bound on the sum rate that depends only on the properties of the game when allowing assistance from an arbitrary set of correlations between the senders. This approach can be used to prove separations between sum capacities when the senders are allowed to share different sets of correlations, such as classical, quantum or no-signalling correlations. We also construct a specific nonlocal game to show that the approach of bounding the sum capacity by relaxing the nonconvex optimization can give arbitrarily loose bounds. Owing to this result, in the second part, we study algorithms for non-convex optimization of a class of functions we call Lipschitz-like functions. This class includes entropic quantities, and hence these results may be of independent interest in information theory. Subsequently, in the third part, we show that one can use these techniques to compute the sum capacity of an arbitrary two-sender MACs to a fixed additive precision in quasi-polynomial time. We showcase our method by efficiently computing the sum capacity of a family of two-sender MACs for which one of the input alphabets has size two. Furthermore, we demonstrate with an example that our algorithm may compute the sum capacity to a higher precision than using the convex relaxation.
△ Less
Submitted 3 August, 2023; v1 submitted 26 May, 2022;
originally announced May 2022.
-
Multi-Tailed, Multi-Headed, Spatial Dynamic Memory refined Text-to-Image Synthesis
Authors:
Amrit Diggavi Seshadri,
Balaraman Ravindran
Abstract:
Synthesizing high-quality, realistic images from text-descriptions is a challenging task, and current methods synthesize images from text in a multi-stage manner, typically by first generating a rough initial image and then refining image details at subsequent stages. However, existing methods that follow this paradigm suffer from three important limitations. Firstly, they synthesize initial image…
▽ More
Synthesizing high-quality, realistic images from text-descriptions is a challenging task, and current methods synthesize images from text in a multi-stage manner, typically by first generating a rough initial image and then refining image details at subsequent stages. However, existing methods that follow this paradigm suffer from three important limitations. Firstly, they synthesize initial images without attempting to separate image attributes at a word-level. As a result, object attributes of initial images (that provide a basis for subsequent refinement) are inherently entangled and ambiguous in nature. Secondly, by using common text-representations for all regions, current methods prevent us from interpreting text in fundamentally different ways at different parts of an image. Different image regions are therefore only allowed to assimilate the same type of information from text at each refinement stage. Finally, current methods generate refinement features only once at each refinement stage and attempt to address all image aspects in a single shot. This single-shot refinement limits the precision with which each refinement stage can learn to improve the prior image. Our proposed method introduces three novel components to address these shortcomings: (1) An initial generation stage that explicitly generates separate sets of image features for each word n-gram. (2) A spatial dynamic memory module for refinement of images. (3) An iterative multi-headed mechanism to make it easier to improve upon multiple image aspects. Experimental results demonstrate that our Multi-Headed Spatial Dynamic Memory image refinement with our Multi-Tailed Word-level Initial Generation (MSMT-GAN) performs favourably against the previous state of the art on the CUB and COCO datasets.
△ Less
Submitted 15 October, 2021;
originally announced October 2021.
-
Discovering Context Effects from Raw Choice Data
Authors:
Arjun Seshadri,
Alexander Peysakhovich,
Johan Ugander
Abstract:
Many applications in preference learning assume that decisions come from the maximization of a stable utility function. Yet a large experimental literature shows that individual choices and judgements can be affected by "irrelevant" aspects of the context in which they are made. An important class of such contexts is the composition of the choice set. In this work, our goal is to discover such cho…
▽ More
Many applications in preference learning assume that decisions come from the maximization of a stable utility function. Yet a large experimental literature shows that individual choices and judgements can be affected by "irrelevant" aspects of the context in which they are made. An important class of such contexts is the composition of the choice set. In this work, our goal is to discover such choice set effects from raw choice data. We introduce an extension of the Multinomial Logit (MNL) model, called the context dependent random utility model (CDM), which allows for a particular class of choice set effects. We show that the CDM can be thought of as a second-order approximation to a general choice system, can be inferred optimally using maximum likelihood and, importantly, is easily interpretable. We apply the CDM to both real and simulated choice data to perform principled exploratory analyses for the presence of choice set effects.
△ Less
Submitted 31 January, 2020; v1 submitted 8 February, 2019;
originally announced February 2019.
-
Consolidating the innovative concepts towards Exascale computing for Co-Design of Co-Applications ll: Co-Design Automation - Workload Characterization
Authors:
Dhanasekar,
Anirudh Seshadri,
Sudharshan Srinivasan,
Suryanarayanan,
Akash Sridhar
Abstract:
Many-core co-design is a complex task in which application complexity design space, heterogeneous many-core architecture design space, parallel programming language design space, simulator design space and optimizer design space should get integrated through a binding process and these design spaces, an ensemble of what is called many-core co-design spaces. It is indispensable to build a co-design…
▽ More
Many-core co-design is a complex task in which application complexity design space, heterogeneous many-core architecture design space, parallel programming language design space, simulator design space and optimizer design space should get integrated through a binding process and these design spaces, an ensemble of what is called many-core co-design spaces. It is indispensable to build a co-design automation process to dominate over the co-design complexity to cut down the turnaround time. The co-design automation is frame worked to comprehend the dependencies across the many-core co-design spaces and devise the logic behind these interdependencies using a set of algorithms. The software modules of these algorithms and the rest from the many-core co-design spaces interact to crop up the power-performance optimized heterogeneous many-core architecture specific for the simultaneous execution of co applications without space-time sharing. It is essential that such co-design automation has a built-in user-customizable workload generator to benchmark the emerging many-core architecture. This customizability benefits the generation of complex workloads with the desired computation complexity, communication complexity, control flow complexity, and locality of reference, specified under a distribution and established on quantitative models. In addition, the customizable workload model aids the generation of what is called computational and communication surges. None of the current day benchmark suites encompasses applications and kernels that can match the attributes of customizable workload model proposed in this paper. Aforementioned concepts are exemplified in, the case study supported by simulation results gathered from the simulator.
△ Less
Submitted 29 April, 2018;
originally announced June 2018.
-
On Uni Chord Free Graphs
Authors:
Mahati Kumar,
S. Manasvini,
N. Sadagopan,
Adithya Seshadri
Abstract:
A graph is unichord free if it does not contain a cycle with exactly one chord as its subgraph. In [3], it is shown that a graph is unichord free if and only if every minimal vertex separator is a stable set. In this paper, we first show that such a graph can be recognized in polynomial time. Further, we show that the chromatic number of unichord free graphs is one of (2,3, ω(G)). We also present…
▽ More
A graph is unichord free if it does not contain a cycle with exactly one chord as its subgraph. In [3], it is shown that a graph is unichord free if and only if every minimal vertex separator is a stable set. In this paper, we first show that such a graph can be recognized in polynomial time. Further, we show that the chromatic number of unichord free graphs is one of (2,3, ω(G)). We also present a polynomial-time algorithm to produce a coloring with ω(G) colors.
△ Less
Submitted 24 October, 2014; v1 submitted 8 November, 2013;
originally announced November 2013.
-
Onboard Dynamic Rail Track Safety Monitoring System
Authors:
Abhisekh Jain,
Arvind Seshadri,
Balaji B. S,
Ramviyas Parasuraman
Abstract:
This proposal aims at solving one of the long prevailing problems in the Indian Railways. This simple method of continuous monitoring and assessment of the condition of the rail tracks can prevent major disasters and save precious human lives. Our method is capable of alerting the train in case of any dislocations in the track or change in strength of the soil. Also it can avert the collisions of…
▽ More
This proposal aims at solving one of the long prevailing problems in the Indian Railways. This simple method of continuous monitoring and assessment of the condition of the rail tracks can prevent major disasters and save precious human lives. Our method is capable of alerting the train in case of any dislocations in the track or change in strength of the soil. Also it can avert the collisions of the train with other or with the vehicles trying to move across the unmanned level crossings.
△ Less
Submitted 21 October, 2014; v1 submitted 2 December, 2012;
originally announced December 2012.