-
Heracles: A Hybrid SSM-Transformer Model for High-Resolution Image and Time-Series Analysis
Authors:
Badri N. Patro,
Suhas Ranganath,
Vinay P. Namboodiri,
Vijay S. Agneeswaran
Abstract:
Transformers have revolutionized image modeling tasks with adaptations like DeIT, Swin, SVT, Biformer, STVit, and FDVIT. However, these models often face challenges with inductive bias and high quadratic complexity, making them less efficient for high-resolution images. State space models (SSMs) such as Mamba, V-Mamba, ViM, and SiMBA offer an alternative for handling high-resolution images in computer vision tasks. These SSMs encounter two major issues. First, they become unstable when scaled to large network sizes. Second, although they efficiently capture global information in images, they inherently struggle with handling local information. To address these challenges, we introduce Heracles, a novel SSM that integrates a local SSM, a global SSM, and an attention-based token interaction module. Heracles leverages a Hartley kernel-based state space model for global image information, a localized convolutional network for local details, and attention mechanisms in deeper layers for token interactions. Our extensive experiments demonstrate that Heracles-C-small achieves state-of-the-art performance on the ImageNet dataset with 84.5% top-1 accuracy. Heracles-C-Large and Heracles-C-Huge further improve accuracy to 85.9% and 86.4%, respectively. Additionally, Heracles excels in transfer learning tasks on datasets such as CIFAR-10, CIFAR-100, Oxford Flowers, and Stanford Cars, and in instance segmentation on the MSCOCO dataset. Heracles also proves its versatility by achieving state-of-the-art results on seven time-series datasets, showcasing its ability to generalize across domains with spectral data, capturing both local and global information. The project page is available at https://github.com/badripatro/heracles
Submitted 3 June, 2024; v1 submitted 26 March, 2024;
originally announced March 2024.
-
MASSIVE: A 1M-Example Multilingual Natural Language Understanding Dataset with 51 Typologically-Diverse Languages
Authors:
Jack FitzGerald,
Christopher Hench,
Charith Peris,
Scott Mackie,
Kay Rottmann,
Ana Sanchez,
Aaron Nash,
Liam Urbach,
Vishesh Kakarala,
Richa Singh,
Swetha Ranganath,
Laurie Crist,
Misha Britan,
Wouter Leeuwis,
Gokhan Tur,
Prem Natarajan
Abstract:
We present the MASSIVE dataset: Multilingual Amazon Slu resource package (SLURP) for Slot-filling, Intent classification, and Virtual assistant Evaluation. MASSIVE contains 1M realistic, parallel, labeled virtual assistant utterances spanning 51 languages, 18 domains, 60 intents, and 55 slots. MASSIVE was created by tasking professional translators to localize the English-only SLURP dataset into 50 typologically diverse languages from 29 genera. We also present modeling results on XLM-R and mT5, including exact match accuracy, intent classification accuracy, and slot-filling F1 score. We have released our dataset, modeling code, and models publicly.
Submitted 17 June, 2022; v1 submitted 18 April, 2022;
originally announced April 2022.
-
Grouping Search Results with Product Graphs in E-commerce Platforms
Authors:
Suhas Ranganath,
Shibsankar Das,
Sanjay Thilaivasan,
Shipra Agarwal,
Varun Shrivastava
Abstract:
Showing relevant search results to the user is the primary challenge for any search system. Walmart e-commerce provides an omnichannel search platform that lets its customers search across millions of products. This search platform takes a textual query as input and shows relevant items from the catalog. One of the primary challenges is that these queries are complex to understand, as they contain multiple intents in many cases. This paper proposes a framework that groups search results into multiple ranked lists in order to better serve user intent. The framework creates a product graph that captures relations between product entities and uses it to group search results into a series of stacks, where each stack provides a group of items based on a precise intent. As an example, for the query "milk," the results can be grouped into multiple stacks such as "white milk", "low-fat milk", "almond milk", and "flavored milk". We measure the impact of our algorithm by evaluating how it improves the user experience in terms of both search quality relevance and user behavioral signals like Add-To-Cart.
Submitted 20 September, 2021;
originally announced September 2021.
-
Signed Link Prediction with Sparse Data: The Role of Personality Information
Authors:
Ghazaleh Beigi,
Suhas Ranganath,
Huan Liu
Abstract:
Predicting signed links in social networks often faces the problem of signed link data sparsity, i.e., only a small percentage of signed links are given. The problem is exacerbated when the number of negative links is much smaller than that of positive links. Boosting signed link prediction necessitates additional information to compensate for data sparsity. According to psychology theories, one rich source of such information is a user's personality, such as optimism and pessimism, which can help determine her propensity to establish positive and negative links. In this study, we investigate how personality information can be obtained, and whether personality information can help alleviate the data sparsity problem for signed link prediction. We propose a novel signed link prediction model that enables empirical exploration of user personality via social media data. We evaluate our proposed model on two datasets of real-world signed link networks. The results demonstrate the complementary role of personality information in the signed link prediction problem. Experimental results also indicate the effectiveness of different levels of personality information for the signed link data sparsity problem.
Submitted 5 March, 2019;
originally announced March 2019.
-
Leveraging Catalog Knowledge Graphs for Query Attribute Identification in E-Commerce Sites
Authors:
Suhas Ranganath
Abstract:
Millions of people use online e-commerce platforms to search for and buy products. Identifying attributes in a query is a critical component in connecting users to relevant items. However, in many cases, the queries have multiple attributes, and some of them will be in conflict with each other. For example, the query "maroon 5 dvds" has two candidate attributes, the color "maroon" or the band "maroon 5", where only one of the attributes can be present. In this paper, we address the problem of resolving conflicting attributes in e-commerce queries. A challenge in this problem is that knowledge bases like Wikipedia that are used to understand web queries are not focused on the e-commerce domain. E-commerce search engines, however, have access to the catalog, which contains detailed information about the items and their attributes. We propose a framework that constructs knowledge graphs from the catalog to resolve conflicting attributes in e-commerce queries. Our experiments on real-world queries on e-commerce platforms demonstrate that resolving conflicting attributes by leveraging catalog information significantly improves attribute identification and also yields more relevant search results.
Submitted 13 July, 2018;
originally announced July 2018.
-
Predicting Online Protest Participation of Social Media Users
Authors:
Suhas Ranganath,
Fred Morstatter,
Xia Hu,
Jiliang Tang,
Huan Liu
Abstract:
Social media has emerged as a popular platform for people to express their viewpoints on political protests like the Arab Spring. Millions of people use social media to communicate and mobilize their viewpoints on protests. Hence, it is a valuable tool for organizing social movements. However, the mechanisms by which protest affects the population are not known, making it difficult to estimate the number of protestors. In this paper, we are inspired by sociological theories of protest participation and propose a framework to predict from the user's past status messages and interactions whether the next post of the user will be a declaration of protest. Drawing concepts from these theories, we model the interplay over time between the user's status messages and the messages interacting with him. We evaluate the framework using data from the social media platform Twitter on protests during the recent Nigerian elections and demonstrate that it can effectively predict whether the next post of a user is a declaration of protest.
Submitted 9 December, 2015;
originally announced December 2015.
-
Undergraduate Signal Processing Laboratories for the Android Operating System
Authors:
Suhas Ranganath,
JJ Thiagarajan,
KN Ramamurthy,
Shuang Hu,
Mahesh Banavar,
Andreas Spanias
Abstract:
We present a DSP simulation environment that will enable students to perform laboratory exercises using Android mobile devices and tablets. Due to the pervasive nature of mobile technology, educational applications designed for mobile devices have the potential to stimulate student interest in addition to offering convenient access and interaction capabilities. This paper describes a portable signal processing laboratory for the Android platform. This software is intended to be an educational tool for students and instructors in DSP and signals and systems courses. The development of Android JDSP (A-JDSP) is carried out using the Android SDK, which is a Java-based open source development platform. The proposed application contains basic DSP functions for convolution, sampling, FFT, filtering, and frequency domain analysis, with a convenient graphical user interface. A description of the architecture, functions, and planned assessments is presented in this paper.
Submitted 24 February, 2015;
originally announced February 2015.
-
Leveraging Social Foci for Information Seeking in Social Media
Authors:
Suhas Ranganath,
Jiliang Tang,
Xia Hu,
Hari Sundaram,
Huan Liu
Abstract:
The rise of social media provides a great opportunity for people to reach out to their social connections to satisfy their information needs. However, generic social media platforms are not explicitly designed to assist users in information seeking. In this paper, we propose a novel framework to identify which of a user's social connections are able to satisfy his information needs. The information need of a social media user is subjective and personal, and we investigate the utility of his social context to identify people able to satisfy it. We present questions users post on Twitter as instances of information seeking activities in social media. We infer soft community memberships of the asker and his social connections by integrating network and content information. Drawing concepts from the social foci theory, we identify answerers who share communities with the asker w.r.t. the question. Our experiments demonstrate that the framework is effective in identifying answerers to social media questions.
Submitted 23 February, 2015;
originally announced February 2015.