-
STEAM: Squeeze and Transform Enhanced Attention Module
Authors:
Rishabh Sabharwal,
Ram Samarth B B,
Parikshit Singh Rathore,
Punit Rathore
Abstract:
Channel and spatial attention mechanisms introduced by earlier works enhance the representation abilities of deep convolutional neural networks (CNNs) but often lead to increased parameter and computation costs. While recent approaches focus solely on efficient feature context modeling for channel attention, we aim to model both channel and spatial attention comprehensively with minimal parameters and reduced computation. Leveraging the principles of relational modeling in graphs, we introduce a constant-parameter module, STEAM: Squeeze and Transform Enhanced Attention Module, which integrates channel and spatial attention to enhance the representation power of CNNs. To our knowledge, we are the first to propose a graph-based approach for modeling both channel and spatial attention, utilizing concepts from multi-head graph transformers. Additionally, we introduce Output Guided Pooling (OGP), which efficiently captures spatial context to further enhance spatial attention. We extensively evaluate STEAM for large-scale image classification, object detection and instance segmentation on standard benchmark datasets. STEAM achieves a 2% increase in accuracy over the standard ResNet-50 model with only a meager increase in GFLOPs. Furthermore, STEAM outperforms leading modules ECA and GCT in terms of accuracy while achieving a three-fold reduction in GFLOPs.
Submitted 12 December, 2024;
originally announced December 2024.
-
Experimental System Design of an Active Fault-Tolerant Quadrotor
Authors:
Jennifer Yeom,
Roshan Balu T M B,
Guanrui Li,
Giuseppe Loianno
Abstract:
Quadrotors have gained popularity over the last decade, aiding humans in complex tasks such as search and rescue, mapping and exploration. Despite their mechanical simplicity and versatility compared to other types of aerial vehicles, they remain vulnerable to rotor failures. In this paper, we propose an algorithmic and mechanical approach to addressing the quadrotor fault-tolerant problem in case of rotor failures. First, we present a fault-tolerant detection and control scheme that includes various attitude error metrics. The scheme transitions to a fault-tolerant control mode by surrendering the yaw control. Subsequently, to ensure compatibility with platform sensing constraints, we investigate how variations in robot rotational drag, achieved through a modular mechanical design appendage, keep the resulting yaw rates within sensor limits. This analysis offers a platform-agnostic framework for designing more reliable and robust quadrotors in the event of rotor failures. Extensive experimental results validate the proposed approach, providing insights into successfully designing a cost-effective quadrotor capable of fault-tolerant control. The overall design enhances safety in scenarios of faulty rotors, without the need for additional sensors or computational resources.
Submitted 9 April, 2024;
originally announced April 2024.
-
Recent Innovations in Footwear Sensors: Role of Smart Footwear in Healthcare -- A Survey
Authors:
Pradyumna G. R.,
Roopa B. Hegde,
Bommegowda K. B.,
Anil Kumar Bhat,
Ganesh R. Naik,
Amit N. Pujari
Abstract:
Smart shoes have ushered in a new era of personalised health monitoring and assistive technology. The shoe leverages technologies such as Bluetooth for data collection and wireless transmission and incorporates features such as GPS tracking, obstacle detection, and fitness tracking. This article provides an overview of the current state of smart shoe technology, highlighting the integration of advanced sensors for health monitoring, energy harvesting, assistive features for the visually impaired, and deep learning for data analysis. The study discusses the potential of smart footwear in medical applications, particularly for patients with diabetes, and the ongoing research in this field. Current footwear challenges are also discussed, including complex construction, poor fit, comfort, and high cost.
Submitted 6 February, 2024; v1 submitted 3 January, 2024;
originally announced February 2024.
-
AutoCharge: Autonomous Charging for Perpetual Quadrotor Missions
Authors:
Alessandro Saviolo,
Jeffrey Mao,
Roshan Balu T M B,
Vivek Radhakrishnan,
Giuseppe Loianno
Abstract:
Battery endurance represents a key challenge for long-term autonomy and long-range operations, especially in the case of aerial robots. In this paper, we propose AutoCharge, an autonomous charging solution for quadrotors that combines a portable ground station with a flexible, lightweight charging tether and is capable of universal, highly efficient, and robust charging. We design and manufacture a pair of circular magnetic connectors to ensure a precise orientation-agnostic electrical connection between the ground station and the charging tether. Moreover, we supply the ground station with an electromagnet that largely increases the tolerance to localization and control errors during the docking maneuver, while still guaranteeing smooth un-docking once the charging process is completed. We demonstrate AutoCharge in a perpetual 10-hour quadrotor flight experiment and show that the docking and un-docking performance is solidly repeatable, enabling perpetual quadrotor flight missions.
Submitted 8 June, 2023;
originally announced June 2023.
-
A Deep Neural Network Deployment Based on Resistive Memory Accelerator Simulation
Authors:
Tejaswanth Reddy Maram,
Ria Barnwal,
Bindu B
Abstract:
The objective of this study is to illustrate the process of training a Deep Neural Network (DNN) within a Resistive RAM (ReRAM) Crossbar-based simulation environment using CrossSim, an Application Programming Interface (API) developed for this purpose. The CrossSim API is designed to simulate neural networks while taking into account factors that may affect the accuracy of solutions during training on non-linear and noisy ReRAM devices. ReRAM-based neural cores that serve as memory accelerators for digital cores on a chip can significantly reduce energy consumption by minimizing data transfers between the processor and SRAM and DRAM. CrossSim employs lookup tables obtained from experimentally derived datasets of real fabricated ReRAM devices to digitally reproduce noisy weight updates to the neural network. The CrossSim directory comprises eight device configurations that operate at different temperatures and are made of various materials. This study aims to analyse the results of training a Neural Network on the Breast Cancer Wisconsin (Diagnostic) dataset using CrossSim, plotting the innercore weight updates and average training and validation loss to investigate the outcomes of all the devices.
Submitted 22 April, 2023;
originally announced April 2023.
-
SCARP: 3D Shape Completion in ARbitrary Poses for Improved Grasping
Authors:
Bipasha Sen,
Aditya Agarwal,
Gaurav Singh,
Brojeshwar B.,
Srinath Sridhar,
Madhava Krishna
Abstract:
Recovering full 3D shapes from partial observations is a challenging task that has been extensively addressed in the computer vision community. Many deep learning methods tackle this problem by training 3D shape generation networks to learn a prior over the full 3D shapes. In this training regime, the methods expect the inputs to be in a fixed canonical form, without which they fail to learn a valid prior over the 3D shapes. We propose SCARP, a model that performs Shape Completion in ARbitrary Poses. Given a partial pointcloud of an object, SCARP learns a disentangled feature representation of pose and shape by relying on rotationally equivariant pose features and geometric shape features trained using a multi-tasking objective. Unlike existing methods that depend on an external canonicalization, SCARP performs canonicalization, pose estimation, and shape completion in a single network, improving the performance by 45% over the existing baselines. In this work, we use SCARP for improving grasp proposals on tabletop objects. By completing partial tabletop objects directly in their observed poses, SCARP enables a SOTA grasp proposal network to improve its proposals by 71.2% on partial shapes. Project page: https://bipashasen.github.io/scarp
Submitted 17 January, 2023;
originally announced January 2023.
-
BioJam Camp: toward justice through bioengineering and biodesign co-learning with youth
Authors:
Callie Chappell,
Henry A. -A.,
Elvia B. O.,
Emily B.,
Bailey B.,
Jacqueline C. -M.,
Caroline Daws,
Cristian F.,
Emiliano G.,
Page Goddard,
Xavier G.,
Anne Hu,
Gabriela J.,
Kelley Langhans,
Briana Martin-Villa,
Penny M. -S.,
Jennifer M.,
Soyang N.,
Melissa Ortiz,
Aryana P.,
Trisha S,
Corinne Takara,
Emily T.,
Paloma Vazquez,
Rolando Perez
, et al. (1 additional author not shown)
Abstract:
BioJam is a political, artistic, and educational project in which Bay Area artists, scientists, and educators collaborate with youth and communities of color to address historical exclusion of their communities in STEM fields and reframe what science can be. As an intergenerational collective, we co-learn on topics of culture (social and biological), community (cultural and ecological), and creativity. We reject the notion that increasing the number of scientists of color requires inculcation in the ways of the dominant culture. Instead, we center cultural practices, traditional ways of knowing, storytelling, art, experiential learning, and community engagement to break down the framing that positions these practices as distinct from science. The goal of this work is to realize a future in which the practice of science is relatable, accessible, and liberatory.
Submitted 1 November, 2022;
originally announced November 2022.
-
IIITT@Dravidian-CodeMix-FIRE2021: Transliterate or translate? Sentiment analysis of code-mixed text in Dravidian languages
Authors:
Karthik Puranik,
Bharathi B,
Senthil Kumar B
Abstract:
Sentiment analysis of social media posts and comments for various marketing and emotional purposes is gaining recognition. With the increasing presence of code-mixed content in various native languages, there is a need for dedicated research to produce promising results. This paper makes a small contribution to this research in the form of sentiment analysis of code-mixed social media comments in the popular Dravidian languages Kannada, Tamil and Malayalam. It describes the work for the shared task conducted by Dravidian-CodeMix at FIRE 2021, employing pre-trained models like ULMFiT and multilingual BERT fine-tuned on the code-mixed dataset, a transliteration (TRAI) of the same, English translations (TRAA) of the TRAI data, and the combination of all three. The results are recorded in this paper; the best models ranked 4th, 5th and 10th in the Tamil, Kannada and Malayalam tasks respectively.
Submitted 15 November, 2021;
originally announced November 2021.
-
Neural Abstractive Text Summarizer for Telugu Language
Authors:
Mohan Bharath B,
Aravindh Gowtham B,
Akhil M
Abstract:
Abstractive text summarization is the process of constructing semantically relevant shorter sentences that capture the essence of the overall meaning of the source text. Manually summarizing large documents is difficult and very time-consuming for humans. Much of the work in abstractive text summarization has been done in English, and almost no significant work has been reported in Telugu abstractive text summarization. In this paper, we propose an abstractive text summarization deep learning model for the Telugu language. The proposed architecture is based on encoder-decoder sequential models with an attention mechanism. We applied this model to a manually created dataset to generate a one-sentence summary of the source text and obtained good results, measured qualitatively.
Submitted 18 January, 2021;
originally announced January 2021.
-
Deception and the Strategy of Influence
Authors:
Brian B.,
William Fleshman,
Kevin H.,
Ryan Kaliszewski,
Shawn R
Abstract:
Organizations have long used deception as a means to exert influence in pursuit of their agendas. In particular, information operations such as propaganda distribution, support of antigovernment protest, and revelation of politically and socially damaging secrets were abundant during World War II and the Cold War. A key component of each of these efforts is deceiving the targets by obscuring intent and identity. Information from a trusted source is more influential than information from an adversary and therefore more likely to sway opinions. The ubiquitous adoption of social media, characterized by user-generated and peer disseminated content, has notably increased the frequency, scale, and efficacy of influence operations worldwide. In this article, we explore how methods of deception including audience building, media hijacking, and community subversion inform the techniques and tradecraft of today's influence operators. We then discuss how a properly equipped and informed public can diagnose and counter malign influence operations.
Submitted 2 November, 2020;
originally announced November 2020.
-
COVID-19 Classification Using Staked Ensembles: A Comprehensive Analysis
Authors:
Lalith Bharadwaj B,
Rohit Boddeda,
Sai Vardhan K,
Madhu G
Abstract:
COVID-19 has spread rapidly with a massive mortality rate, which led the WHO to declare it a pandemic. In this situation, it is crucial to perform efficient and fast diagnosis. The reverse transcription polymerase chain reaction (RT-PCR) test is conducted to detect the presence of SARS-CoV-2. This test is time-consuming; chest CT (or chest X-ray) can instead be used for a fast and accurate diagnosis. Automated diagnosis is considered important as it reduces human effort and provides accurate and low-cost tests. The contributions of our research are three-fold. First, we analyse the behaviour and performance of various vision models, ranging from Inception to NAS networks, with an appropriate fine-tuning procedure. Second, the behaviour of these models is visually analysed by plotting CAMs for individual networks, and classification performance is determined with AUC-ROC curves. Third, stacked ensemble techniques are applied to provide higher generalisation by combining the fine-tuned models; six ensemble neural networks are designed by combining the existing fine-tuned networks. Applying these stacked ensembles gives the models strong generalisation. The ensemble model designed by combining all the fine-tuned networks obtained a state-of-the-art accuracy score of 99.17%. The precision and recall for the COVID-19 class are 99.99% and 89.79% respectively, which reflects the robustness of the stacked ensembles.
Submitted 7 August, 2021; v1 submitted 7 October, 2020;
originally announced October 2020.
-
Regression-based music emotion prediction using triplet neural networks
Authors:
Kin Wai Cheuk,
Yin-Jyun Luo,
Balamurali B T,
Gemma Roig,
Dorien Herremans
Abstract:
In this paper, we adapt triplet neural networks (TNNs) to a regression task, music emotion prediction. Since TNNs were initially introduced for classification, and not for regression, we propose a mechanism that allows them to provide meaningful low dimensional representations for regression tasks. We then use these new representations as the input for regression algorithms such as support vector machines and gradient boosting machines. To demonstrate the TNNs' effectiveness at creating meaningful representations, we compare them to different dimensionality reduction methods on music emotion prediction, i.e., predicting valence and arousal values from musical audio signals. Our results on the DEAM dataset show that by using TNNs we achieve 90% feature dimensionality reduction with a 9% improvement in valence prediction and 4% improvement in arousal prediction with respect to our baseline models (without TNN). Our TNN method outperforms other dimensionality reduction methods such as principal component analysis (PCA) and autoencoders (AE). This shows that, in addition to providing a compact latent space representation of audio features, the proposed approach has a higher performance than the baseline models.
Submitted 21 July, 2020; v1 submitted 24 January, 2020;
originally announced January 2020.
-
Deep Learning for Digital Text Analytics: Sentiment Analysis
Authors:
Reshma U,
Barathi Ganesh H B,
Mandar Kale,
Prachi Mankame,
Gouri Kulkarni
Abstract:
In today's scenario, imagining a world without negativity is very unrealistic, as bad news spreads more virally than good news. Though it seems impractical in real life, this could be implemented by building a system that uses Machine Learning and Natural Language Processing techniques to identify news items with a negative shade and filter them out, delivering only the news with a positive shade (good news) to the end user. In this work, around two lakh news items have been used for training and testing with a combination of rule-based and data-driven approaches. VADER, along with a filtration method, was used as an annotation tool, followed by a statistical Machine Learning approach that used a Document Term Matrix (representation) and a Support Vector Machine (classification). Deep Learning algorithms (Doc2Vec) were then introduced to make the system reliable, ending with a Convolutional Neural Network (CNN) that yielded better results than the other modules tested. It showed a training accuracy of 96%, while a test accuracy above 85% was obtained on internal and external news data.
Submitted 10 April, 2018;
originally announced April 2018.
-
A Survey of Voice Translation Methodologies - Acoustic Dialect Decoder
Authors:
Hans Krupakar,
Keerthika Rajvel,
Bharathi B,
Angel Deborah S,
Vallidevi Krishnamurthy
Abstract:
Speech translation has always been about giving source text or audio input and waiting for the system to give translated output in the desired form. In this paper, we present the Acoustic Dialect Decoder (ADD), a voice-to-voice ear-piece translation device. We introduce and survey the recent advances made in the field of Speech Engineering for use in the ADD, focusing particularly on the three major processing steps of Recognition, Translation and Synthesis. We tackle the problem of machine understanding of natural language by designing a recognition unit for source audio to text, a translation unit for source language text to target language text, and a synthesis unit for target language text to target language speech. Speech from the surroundings will be recorded by the recognition unit present on the ear-piece, and translation will start as soon as one sentence is successfully read. This way, we hope to give translated output as and when input is being read. The recognition unit will use the Hidden Markov Models (HMMs) Based Tool-Kit (HTK) and hybrid RNN systems with gated memory cells, and the synthesis unit will use the HMM-based speech synthesis system HTS. This system will initially be built as an English to Tamil translation device.
Submitted 13 October, 2016;
originally announced October 2016.
-
Quantitative methods for Phylogenetic Inference in Historical Linguistics: An experimental case study of South Central Dravidian
Authors:
Taraka Rama,
Sudheer Kolachina,
Lakshmi Bai B
Abstract:
In this paper we examine the usefulness of two classes of algorithms, Distance Methods and Discrete Character Methods (Felsenstein and Felsenstein 2003), widely used in genetics, for predicting the family relationships among a set of related languages and, therefore, diachronic language change. Applying these algorithms to the data on the numbers of shared cognates-with-change and changed as well as unchanged cognates for a group of six languages belonging to a Dravidian language sub-family given in Krishnamurti et al. (1983), we observed that the resultant phylogenetic trees are largely in agreement with the linguistic family tree constructed using the comparative method of reconstruction, with only a few minor differences. Furthermore, we studied these minor differences and found that they were cases of genuine ambiguity even for a well-trained historical linguist. We evaluated the trees obtained through our experiments using a well-defined criterion and report the results here. We finally conclude that quantitative methods like the ones we examined are quite useful in predicting family relationships among languages. In addition, we conclude that a modest degree of confidence attached to the intuition that there could indeed exist a parallelism between the processes of linguistic and genetic change is not totally misplaced.
Submitted 3 January, 2014;
originally announced January 2014.