-
A Comprehensive Survey of Accelerated Generation Techniques in Large Language Models
Authors:
Mahsa Khoshnoodi,
Vinija Jain,
Mingye Gao,
Malavika Srikanth,
Aman Chadha
Abstract:
Despite the crucial importance of accelerating text generation in large language models (LLMs) for efficiently producing content, the sequential nature of this process often leads to high inference latency, posing challenges for real-time applications. Various techniques have been proposed and developed to address these challenges and improve efficiency. This paper presents a comprehensive survey of accelerated generation techniques in autoregressive language models, aiming to understand the state-of-the-art methods and their applications. We categorize these techniques into several key areas: speculative decoding, early exiting mechanisms, and non-autoregressive methods. We discuss each category's underlying principles, advantages, limitations, and recent advancements. Through this survey, we aim to offer insights into the current landscape of techniques in LLMs and provide guidance for future research directions in this critical area of natural language processing.
Submitted 24 May, 2024; v1 submitted 15 May, 2024;
originally announced May 2024.
-
SPRING-INX: A Multilingual Indian Language Speech Corpus by SPRING Lab, IIT Madras
Authors:
Nithya R,
Malavika S,
Jordan F,
Arjun Gangwar,
Metilda N J,
S Umesh,
Rithik Sarab,
Akhilesh Kumar Dubey,
Govind Divakaran,
Samudra Vijaya K,
Suryakanth V Gangashetty
Abstract:
India is home to a multitude of languages, of which 22 are recognised by the Indian Constitution as official. Building speech-based applications for the Indian population is a difficult problem owing to limited data and the number of languages and accents to accommodate. To encourage the language technology community to build speech-based applications in Indian languages, we are open-sourcing the SPRING-INX data, which has about 2000 hours of legally sourced and manually transcribed speech data for ASR system building in Assamese, Bengali, Gujarati, Hindi, Kannada, Malayalam, Marathi, Odia, Punjabi and Tamil. This endeavor is by SPRING Lab, Indian Institute of Technology Madras, and is part of the National Language Translation Mission (NLTM), funded by the Indian Ministry of Electronics and Information Technology (MeitY), Government of India. We describe the data collection and data cleaning process along with the data statistics in this paper.
Submitted 24 October, 2023; v1 submitted 23 October, 2023;
originally announced October 2023.
-
Controlled and Conditional Text to Image Generation with Diffusion Prior
Authors:
Pranav Aggarwal,
Hareesh Ravi,
Naveen Marri,
Sachin Kelkar,
Fengbin Chen,
Vinh Khuc,
Midhun Harikumar,
Ritiz Tambi,
Sudharshan Reddy Kakumanu,
Purvak Lapsiya,
Alvin Ghouas,
Sarah Saber,
Malavika Ramprasad,
Baldo Faieta,
Ajinkya Kale
Abstract:
Denoising Diffusion models have shown remarkable performance in generating diverse, high-quality images from text. Numerous techniques have been proposed on top of or in alignment with models like Stable Diffusion and Imagen that generate images directly from text. A less explored approach is DALLE-2's two-step process comprising a Diffusion Prior that generates a CLIP image embedding from text and a Diffusion Decoder that generates an image from a CLIP image embedding. We explore the capabilities of the Diffusion Prior and the advantages of an intermediate CLIP representation. We observe that the Diffusion Prior can be used in a memory- and compute-efficient way to constrain the generation to a specific domain without altering the larger Diffusion Decoder. Moreover, we show that the Diffusion Prior can be trained with additional conditional information, such as a color histogram, to further control the generation. We show quantitatively and qualitatively that the proposed approaches perform better than prompt engineering for domain-specific generation and existing baselines for color-conditioned generation. We believe that our observations and results will instigate further research into the diffusion prior and uncover more of its capabilities.
Submitted 1 August, 2023; v1 submitted 22 February, 2023;
originally announced February 2023.
-
Physics-informed Neural Networks approach to solve the Blasius function
Authors:
Greeshma Krishna,
Malavika S Nair,
Pramod P Nair,
Anil Lal S
Abstract:
Deep learning techniques with neural networks have been used effectively in computational fluid dynamics (CFD) to obtain solutions to nonlinear differential equations. This paper presents a physics-informed neural network (PINN) approach to solve the Blasius function. This method eliminates the process of changing the nonlinear differential equation to an initial value problem. It also tackles the convergence issue arising in the conventional series solution. The method produces results that are on par with the numerical and conventional methods. The solution is extended to the negative axis to show that PINNs capture the singularity of the function at $η=-5.69$.
Submitted 5 February, 2023; v1 submitted 30 December, 2022;
originally announced January 2023.
-
Detecting Contradictory COVID-19 Drug Efficacy Claims from Biomedical Literature
Authors:
Daniel N. Sosa,
Malavika Suresh,
Christopher Potts,
Russ B. Altman
Abstract:
The COVID-19 pandemic created a deluge of questionable and contradictory scientific claims about drug efficacy -- an "infodemic" with lasting consequences for science and society. In this work, we argue that NLP models can help domain experts distill and understand the literature in this complex, high-stakes area. Our task is to automatically identify contradictory claims about COVID-19 drug efficacy. We frame this as a natural language inference problem and offer a new NLI dataset created by domain experts. The NLI framing allows us to create curricula combining existing datasets and our own. The resulting models are useful investigative tools. We provide a case study of how these models help a domain expert summarize and assess evidence concerning remdesivir and hydroxychloroquine.
Submitted 19 December, 2022;
originally announced December 2022.
-
Secure and Safety Mobile Network System for Visually Impaired People
Authors:
Shyama Kumari Arunachalam,
Roopa V,
Meena H B,
Vijayalakshmi,
T Malavika
Abstract:
The proposed system aims to be a techno-friend of visually impaired people, assisting them in orientation and mobility both indoors and outdoors. Moving through an unknown environment is a real challenge for most of them, even though they rely on their other senses. An age-old aid for blind people is the white cane, commonly known as a walking cane: a simple, purely mechanical device that detects the ground, uneven surfaces, holes and steps through simple tactile-force feedback.
Submitted 1 December, 2021;
originally announced December 2021.
-
Searching for Replacement Classes
Authors:
Malavika Samak,
Jose Pablo Cambronero,
Martin C. Rinard
Abstract:
Software developers must often replace existing components in their systems to adapt to evolving environments or tooling. While traditional code search systems are effective at retrieving components with related functionality, it is much more challenging to retrieve components that can be used to directly replace existing functionality, as replacements must account for more fundamental program properties such as type compatibility. To address this problem, we introduce ClassFinder, a system which, given a query class Q and a search corpus S, returns a ranked subset of classes that can replace Q and its functionality. ClassFinder produces a field and method mapping between the classes that can provide useful hints to a developer and can be used to effectively refine the ranking of candidate replacement classes. Our technique leverages the complementary strengths of a distributed embeddings-based search and type-based analysis, using the former to prune down candidates for an optimization-based approach based on the latter. ClassFinder retrieves replacement classes, along with a type-aware field/method mapping between classes. We evaluate ClassFinder on a search space of ~600 thousand open-source Java classes. Querying ClassFinder with 24 known Java classes provided meaningful replacement classes and mappings, in many cases producing complete mappings with functionally identical replacement classes.
Submitted 11 October, 2021;
originally announced October 2021.
-
Adversarial Learning for Zero-Shot Stance Detection on Social Media
Authors:
Emily Allaway,
Malavika Srikanth,
Kathleen McKeown
Abstract:
Stance detection on social media can help to identify and understand slanted news or commentary in everyday life. In this work, we propose a new model for zero-shot stance detection on Twitter that uses adversarial learning to generalize across topics. Our model achieves state-of-the-art performance on a number of unseen test topics with minimal computational costs. In addition, we extend zero-shot stance detection to new topics, highlighting future directions for zero-shot transfer.
Submitted 13 May, 2021;
originally announced May 2021.
-
Vehicle Trajectory Prediction by Transfer Learning of Semi-Supervised Models
Authors:
Nick Lamm,
Shashank Jaiprakash,
Malavika Srikanth,
Iddo Drori
Abstract:
In this work we show that semi-supervised models for vehicle trajectory prediction significantly improve performance over supervised models on state-of-the-art real-world benchmarks. Moving from supervised to semi-supervised models allows scaling up by using unlabeled data, increasing the number of images in pre-training from millions to a billion. We perform ablation studies comparing transfer learning of semi-supervised and supervised models while keeping all other factors equal. Within semi-supervised models, we compare contrastive learning with teacher-student methods, as well as networks predicting a small number of trajectories with networks predicting probabilities over a large trajectory set. Our results using both low-level and mid-level representations of the driving environment demonstrate the applicability of semi-supervised methods for real-world vehicle trajectory prediction.
Submitted 9 October, 2020; v1 submitted 13 July, 2020;
originally announced July 2020.
-
An operational architecture for privacy-by-design in public service applications
Authors:
Prashant Agrawal,
Anubhutie Singh,
Malavika Raghavan,
Subodh Sharma,
Subhashis Banerjee
Abstract:
Governments around the world are trying to build large data registries for effective delivery of a variety of public services. However, these efforts are often undermined by serious concerns over privacy risks associated with the collection and processing of personally identifiable information. While a rich set of special-purpose privacy-preserving techniques exists in computer science, they are unable to provide end-to-end protection in alignment with legal principles in the absence of an overarching operational architecture to ensure purpose limitation and protection against insider attacks. This leads either to weak privacy protection in large designs, or to the adoption of overly defensive strategies that protect privacy by compromising on utility.
In this paper, we present an operational architecture for privacy-by-design based on independent regulatory oversight stipulated by most data protection regimes, regulated access control, purpose limitation and data minimisation. We briefly discuss the feasibility of implementing our architecture based on existing techniques. We also present some sample case studies of privacy-preserving design sketches of challenging public service applications.
Submitted 8 June, 2020;
originally announced June 2020.
-
Image to Language Understanding: Captioning approach
Authors:
Madhavan Seshadri,
Malavika Srikanth,
Mikhail Belov
Abstract:
Extracting context from visual representations is of utmost importance in the advancement of Computer Science. Representing such content in natural language has a wide variety of applications, such as helping the visually impaired. Such an approach combines Computer Vision and Natural Language techniques, which is a hard problem to solve. This project aims to compare different approaches for solving the image captioning problem. Specifically, the focus was on comparing two different types of models: an encoder-decoder approach and a multi-modal approach. In the encoder-decoder approach, inject and merge architectures were compared against a multi-modal image captioning approach based primarily on object detection. These approaches have been compared on the basis of state-of-the-art sentence comparison metrics such as BLEU, GLEU, Meteor, and Rouge on a subset of the Google Conceptual Captions dataset which contains 100k images. On the basis of this comparison, we observed that the best model was the Inception injected encoder model. This best approach has been deployed as a web-based system. On uploading an image, the system outputs the best caption associated with the image.
Submitted 21 February, 2020;
originally announced February 2020.
-
Debunking the Myth that Upfront Requirements are Infeasible for Scientific Computing Software
Authors:
Spencer Smith,
Malavika Srinivasan,
Sumanth Shankar
Abstract:
Many in the Scientific Computing Software community believe that upfront requirements are impossible, or at least infeasible. This paper shows requirements are feasible with the following: i) an appropriate perspective ("faking" the final documentation as if requirements were correct and complete from the start, and gathering requirements as if for a family of programs); ii) the aid of the right principles (abstraction, separation of concerns, anticipation of change, and generality); iii) employing SCS specific templates (for Software Requirements and Module Interface Specification); iv) using a design process that enables change (information hiding); and, v) the aid of modern tools (version control, issue tracking, checking, generation and automation tools). Not only are upfront requirements feasible, they provide significant benefits, including facilitating communication, early identification of errors, better design decisions and enabling replicability. The topics listed above are explained, justified and illustrated via an example of software developed by a small team of software and mechanical engineers for modelling the solidification of a metal alloy.
Submitted 18 June, 2019;
originally announced June 2019.