Search | arXiv e-print repository

Adaptive Object Detection with ESRGAN-Enhanced Resolution & Faster R-CNN

Authors: Divya Swetha K, Ziaul Haque Choudhury, Hemanta Kumar Bhuyan, Biswajit Brahma, Nilayam Kumar Kamila

Abstract: This study proposes a method for improved object detection from low-resolution images by integrating Enhanced Super-Resolution Generative Adversarial Networks (ESRGAN) and Faster Region-Convolutional Neural Network (Faster R-CNN). ESRGAN enhances low-quality images, restoring details and improving clarity, while Faster R-CNN performs accurate object detection on the enhanced images. The combination of these techniques ensures better detection performance, even with poor-quality inputs, offering an effective solution for applications where image resolution is inconsistent. ESRGAN is employed as a pre-processing step to enhance the low-resolution input image, effectively restoring lost details and improving overall image quality. Subsequently, the enhanced image is fed into the Faster R-CNN model for accurate object detection and localization. Experimental results demonstrate that this integrated approach yields superior performance compared to traditional methods applied directly to low-resolution images. The proposed framework provides a promising solution for applications where image quality is variable or limited, enabling more robust and reliable object detection in challenging scenarios. It achieves a balance between improved image quality and efficient object detection.

Submitted 10 June, 2025; originally announced June 2025.

arXiv:2505.08685 [pdf, ps, other]

Calibration and Uncertainty for multiRater Volume Assessment in multiorgan Segmentation (CURVAS) challenge results

Authors: Meritxell Riera-Marin, Sikha O K, Julia Rodriguez-Comas, Matthias Stefan May, Zhaohong Pan, Xiang Zhou, Xiaokun Liang, Franciskus Xaverius Erick, Andrea Prenner, Cedric Hemon, Valentin Boussot, Jean-Louis Dillenseger, Jean-Claude Nunes, Abdul Qayyum, Moona Mazher, Steven A Niederer, Kaisar Kushibar, Carlos Martin-Isla, Petia Radeva, Karim Lekadir, Theodore Barfoot, Luis C. Garcia Peraza Herrera, Ben Glocker, Tom Vercauteren, Lucas Gago, et al. (7 additional authors not shown)

Abstract: Deep learning (DL) has become the dominant approach for medical image segmentation, yet ensuring the reliability and clinical applicability of these models requires addressing key challenges such as annotation variability, calibration, and uncertainty estimation. This is why we created the Calibration and Uncertainty for multiRater Volume Assessment in multiorgan Segmentation (CURVAS), which highlights the critical role of multiple annotators in establishing a more comprehensive ground truth, emphasizing that segmentation is inherently subjective and that leveraging inter-annotator variability is essential for robust model evaluation. Seven teams participated in the challenge, submitting a variety of DL models evaluated using metrics such as Dice Similarity Coefficient (DSC), Expected Calibration Error (ECE), and Continuous Ranked Probability Score (CRPS). By incorporating consensus and dissensus ground truth, we assess how DL models handle uncertainty and whether their confidence estimates align with true segmentation performance. Our findings reinforce the importance of well-calibrated models, as better calibration is strongly correlated with the quality of the results. Furthermore, we demonstrate that segmentation models trained on diverse datasets and enriched with pre-trained knowledge exhibit greater robustness, particularly in cases deviating from standard anatomical structures. Notably, the best-performing models achieved high DSC and well-calibrated uncertainty estimates.
This work underscores the need for multi-annotator ground truth, thorough calibration assessments, and uncertainty-aware evaluations to develop trustworthy and clinically reliable DL-based medical image segmentation models.

Submitted 13 May, 2025; originally announced May 2025.

Comments: This challenge was hosted in MICCAI 2024

arXiv:2505.07782 [pdf, ps, other]

MLE-Dojo: Interactive Environments for Empowering LLM Agents in Machine Learning Engineering

Authors: Rushi Qiang, Yuchen Zhuang, Yinghao Li, Dingu Sagar V K, Rongzhi Zhang, Changhao Li, Ian Shu-Hei Wong, Sherry Yang, Percy Liang, Chao Zhang, Bo Dai

Abstract: We introduce MLE-Dojo, a Gym-style framework for systematic reinforcement learning, evaluation, and improvement of autonomous large language model (LLM) agents in iterative machine learning engineering (MLE) workflows. Unlike existing benchmarks that primarily rely on static datasets or single-attempt evaluations, MLE-Dojo provides an interactive environment enabling agents to iteratively experiment, debug, and refine solutions through structured feedback loops. Built upon 200+ real-world Kaggle challenges, MLE-Dojo covers diverse, open-ended MLE tasks carefully curated to reflect realistic engineering scenarios such as data processing, architecture search, hyperparameter tuning, and code debugging. Its fully executable environment supports comprehensive agent training via both supervised fine-tuning and reinforcement learning, facilitating iterative experimentation, realistic data sampling, and real-time outcome verification. Extensive evaluations of eight frontier LLMs reveal that while current models achieve meaningful iterative improvements, they still exhibit significant limitations in autonomously generating long-horizon solutions and efficiently resolving complex errors. Furthermore, MLE-Dojo's flexible and extensible architecture seamlessly integrates diverse data sources, tools, and evaluation protocols, uniquely enabling model-based agent tuning and promoting interoperability, scalability, and reproducibility. We open-source our framework and benchmarks to foster community-driven innovation towards next-generation MLE agents.

Submitted 12 May, 2025; originally announced May 2025.

arXiv:2505.00137 [pdf, other]

Toward Practical Quantum Machine Learning: A Novel Hybrid Quantum LSTM for Fraud Detection

Authors: Rushikesh Ubale, Sujan K. K., Sangram Deshpande, Gregory T. Byrd

Abstract: We present a novel hybrid quantum-classical neural network architecture for fraud detection that integrates a classical Long Short-Term Memory (LSTM) network with a variational quantum circuit. By leveraging quantum phenomena such as superposition and entanglement, our model enhances the feature representation of sequential transaction data, capturing complex non-linear patterns that are challenging for purely classical models. A comprehensive data preprocessing pipeline is employed to clean, encode, balance, and normalize a credit card fraud dataset, ensuring a fair comparison with baseline models. Notably, our hybrid approach achieves per-epoch training times in the range of 45-65 seconds, which is significantly faster than similar architectures reported in the literature, where training typically requires several minutes per epoch. Both classical and quantum gradients are jointly optimized via a unified backpropagation procedure employing the parameter-shift rule for the quantum parameters. Experimental evaluations demonstrate competitive improvements in accuracy, precision, recall, and F1 score relative to a conventional LSTM baseline. These results underscore the promise of hybrid quantum-classical techniques in advancing the efficiency and performance of fraud detection systems. Keywords: Hybrid Quantum-Classical Neural Networks, Quantum Computing, Fraud Detection, Hybrid Quantum LSTM, Variational Quantum Circuit, Parameter-Shift Rule, Financial Risk Analysis

Submitted 30 April, 2025; originally announced May 2025.

Comments: 11 pages, 8 figures

arXiv:2503.18669 [pdf, other]

A Comprehensive Review on Hashtag Recommendation: From Traditional to Deep Learning and Beyond

Authors: Shubhi Bansal, Kushaan Gowda, Anupama Sureshbabu K, Chirag Kothari, Nagendra Kumar

Abstract: The exponential growth of user-generated content on social media platforms has precipitated significant challenges in information management, particularly in content organization, retrieval, and discovery. Hashtags, as a fundamental categorization mechanism, play a pivotal role in enhancing content visibility and user engagement. However, the development of accurate and robust hashtag recommendation systems remains a complex and evolving research challenge. Existing surveys in this domain are limited in scope and recency, focusing narrowly on specific platforms, methodologies, or timeframes. To address this gap, this review article conducts a systematic analysis of hashtag recommendation systems, comprehensively examining recent advancements across several dimensions. We investigate unimodal versus multimodal methodologies, diverse problem formulations, filtering strategies, and the methodological evolution from traditional frequency-based models to advanced deep learning architectures. Furthermore, we critically evaluate performance assessment paradigms, including quantitative metrics, qualitative analyses, and hybrid evaluation frameworks. Our analysis underscores a paradigm shift toward transformer-based deep learning models, which harness contextual and semantic features to achieve superior recommendation accuracy. Key challenges such as data sparsity, cold-start scenarios, polysemy, and model explainability are rigorously discussed, alongside practical applications in tweet classification, sentiment analysis, and content popularity prediction.
By synthesizing insights from diverse methodological and platform-specific perspectives, this survey provides a structured taxonomy of current research, identifies unresolved gaps, and proposes future directions for developing adaptive, user-centric recommendation systems.

Submitted 25 March, 2025; v1 submitted 24 March, 2025; originally announced March 2025.

arXiv:2410.10403 [pdf, other]

Near-Pilotless MIMO Single Carrier Communications using Matrix Decomposition

Authors: Sai Praneeth K, P Aswathylakshmi, Radhakrishna Ganti

Abstract: Multiple Input-Multiple Output (MIMO) is a key enabler of higher data rates in next-generation wireless communications. However, in MIMO systems, channel estimation and equalization are challenging, particularly in the presence of rapidly changing channels. The high pilot overhead required for channel estimation can reduce the system throughput for large antenna configurations. In this paper, we provide an iterative matrix decomposition algorithm for near-pilotless or blind decoding of MIMO signals in a single carrier system with frequency domain equalization. This novel approach replaces the standard equalization and estimates both the transmitted data and the channel without knowledge of any prior distributions, making use of only one pilot. Our simulations demonstrate improved performance, in terms of error rates, compared to the more widely used pilot-based Maximal Ratio Combining (MRC) method.

Submitted 14 October, 2024; originally announced October 2024.

Comments: 6 pages, 8 figures

arXiv:2409.14769 [pdf, other]

Language-Agnostic Analysis of Speech Depression Detection

Authors: Sona Binu, Jismi Jose, Fathima Shimna K V, Alino Luke Hans, Reni K. Cherian, Starlet Ben Alex, Priyanka Srivastava, Chiranjeevi Yarra

Abstract: People with Major Depressive Disorder (MDD) exhibit tonal variations in their speech compared to their healthy counterparts. However, these tonal variations depend not only on the state of MDD but also on the language, which has its own tonal patterns. This work analyzes automatic speech-based depression detection across two languages, English and Malayalam, which exhibit distinctive prosodic and phonemic characteristics. We propose an approach that utilizes speech data collected along with self-reported labels from participants reading sentences from the IViE corpus, in both English and Malayalam. The IViE corpus consists of five sets of sentences: simple sentences, WH-questions, questions without morphosyntactic markers, inversion questions and coordinations, which can naturally prompt speakers to speak in different tonal patterns. Convolutional Neural Networks (CNNs) are employed for detecting depression from speech. The CNN model is trained to identify acoustic features associated with depression in speech, focusing on both languages. The model's performance is evaluated on the collected dataset containing recordings from both depressed and non-depressed speakers, analyzing its effectiveness in detecting depression across the two languages. Our findings and collected data could contribute to the development of language-agnostic speech-based depression detection systems, thereby enhancing accessibility for diverse populations.

Submitted 23 September, 2024; originally announced September 2024.

arXiv:2407.19492 [pdf, other]

Heads Up eXperience (HUX): Always-On AI Companion for Human Computer Environment Interaction

Authors: Sukanth K, Sudhiksha Kandavel Rajan, Rajashekhar V S, Gowdham Prabhakar

Abstract: While current personal smart devices excel in digital domains, they fall short in assisting users during human environment interaction. This paper proposes Heads Up eXperience (HUX), an AI system designed to bridge this gap, serving as a constant companion across extended reality (XR) environments. By tracking the user's eye gaze, analyzing the surrounding environment, and interpreting verbal contexts, the system captures and enhances multi-modal data, providing holistic context interpretation and memory storage in real-time, task-specific situations. This comprehensive approach enables more natural, empathetic and intelligent interactions between the user and HUX AI, paving the path for human computer environment interaction. Intended for deployment in smart glasses and extended reality headsets, HUX AI aims to become a personal and useful AI companion for daily life. By integrating digital assistance with enhanced physical world interactions, this technology has the potential to revolutionize human-AI collaboration in both personal and professional spheres, paving the way for the future of personal smart devices.

Submitted 28 July, 2024; originally announced July 2024.

Comments: 48 pages, 16 figures

arXiv:2407.13944 [pdf, other]

doi 10.1016/j.future.2024.07.001

PowerTrain: Fast, Generalizable Time and Power Prediction Models to Optimize DNN Training on Accelerated Edges

Authors: Prashanthi S. K., Saisamarth Taluri, Beautlin S, Lakshya Karwa, Yogesh Simmhan

Abstract: Accelerated edge devices, like Nvidia's Jetson with 1000+ CUDA cores, are increasingly used for DNN training and federated learning, rather than just for inferencing workloads. A unique feature of these compact devices is their fine-grained control over CPU, GPU, memory frequencies, and active CPU cores, which can limit their power envelope in a constrained setting while throttling the compute performance. Given this vast 10k+ parameter space, selecting a power mode for dynamically arriving training workloads to exploit power-performance trade-offs requires costly profiling for each new workload, or is done ad hoc. We propose PowerTrain, a transfer-learning approach to accurately predict the power and time consumed when training a given DNN workload (model + dataset) using any specified power mode (CPU/GPU/memory frequencies, core-count). It requires a one-time offline profiling of 1000s of power modes for a reference DNN workload on a single Jetson device (Orin AGX) to build Neural Network (NN) based prediction models for time and power. These NN models are subsequently transferred (retrained) for a new DNN workload, or even a different Jetson device, with minimal additional profiling of just 50 power modes to make accurate time and power predictions. These are then used to rapidly construct the Pareto front and select the optimal power mode for the new workload.
PowerTrain's predictions are robust to new workloads, exhibiting a low MAPE of <6% for power and <15% for time on six new training workloads for up to 4400 power modes, when transferred from a ResNet reference workload on Orin AGX. It is also resilient when transferred to two entirely new Jetson devices with prediction errors of <14.5% and <11%. These outperform baseline predictions by more than 10% and baseline optimizations by up to 45% on time and 88% on power.

Submitted 18 July, 2024; originally announced July 2024.

Comments: Preprint of article in Elsevier's Future Generation Computer Systems (FGCS)

arXiv:2407.10902 [pdf]

Interpreting Hand gestures using Object Detection and Digits Classification

Authors: Sangeetha K, Balaji VS, Kamalesh P, Anirudh Ganapathy PS

Abstract: Hand gestures have evolved into a natural and intuitive means of engaging with technology. The objective of this research is to develop a robust system that can accurately recognize and classify hand gestures representing numbers. The proposed approach involves collecting a dataset of hand gesture images, preprocessing and enhancing the images, extracting relevant features, and training a machine learning model. The advancement of computer vision technology and object detection techniques, in conjunction with OpenCV's capability to analyze and comprehend hand gestures, presents a chance to transform the identification of numerical digits and its potential applications. The advancement of computer vision technology and object identification technologies, along with OpenCV's capacity to analyze and interpret hand gestures, has the potential to revolutionize human interaction, boosting people's access to information, education, and employment opportunities. Keywords: Computer Vision, Machine learning, Deep Learning, Neural Networks

Submitted 15 July, 2024; originally announced July 2024.

arXiv:2407.08743 [pdf, other]

Genetic Bottleneck and the Emergence of High Intelligence by Scaling-out and High Throughput

Authors: Arifa Khan, Saravanan P, Venkatesan S. K.

Abstract: We study the biological evolution of low-latency natural neural networks for short-term survival, and its parallels in the development of low-latency, high-performance Central Processing Units in computer design and architecture. The necessity of accurate high-quality display of motion pictures led to the special processing units known as GPUs, just as special visual cortex regions of animals produced such low-latency computational capacity. The human brain, especially considered as nothing but a scaled-up version of a primate brain, evolved in response to a genomic bottleneck, producing a brain that is trainable and prunable by society, and as a further extension, invents language, writing and storage of narratives displaced in time and space. We conclude that this modern digital invention of social media and the archived collective common corpus has further evolved from just simple CPU-based low-latency fast retrieval to high-throughput parallel processing of data using GPUs to train attention-based Deep Learning Neural Networks producing Generative AI with aspects like toxicity, bias, memorization, hallucination, with intriguing close parallels in humans and their society. We show how this paves the way for constructive approaches to eliminating such drawbacks from human society and its proxy and collective large-scale mirror, the Generative AI of the LLMs.

Submitted 29 May, 2024; originally announced July 2024.

arXiv:2404.10086 [pdf]

Empowering Enterprise Development by Building and Deploying Admin Dashboard using Refine Framework

Authors: Sai Teja Gajjala, Devi Deepak Manchala, Bhargav Gummadelly, Naga Sailaja K

Abstract: This project proposes the development of an advanced admin dashboard tailored for enterprise development, leveraging the Refine framework, Ant Design, and GraphQL API. It promises heightened operational efficiency by optimizing backend integration and employing GraphQL's dynamic data subscription for real-time insights. With an emphasis on modern aesthetics and user-centric design, it ensures seamless data visualization and management. Key functionalities encompass user administration, data visualization, CRUD operations, real-time notifications, and seamless integration with existing systems. The deliverable includes a deployable dashboard alongside comprehensive documentation, aiming to empower enterprise teams with a cutting-edge, data-driven solution.

Submitted 15 April, 2024; originally announced April 2024.

arXiv:2404.03908 [pdf, other]

Multi-Task Learning for Lung sound & Lung disease classification

Authors: Suma K V, Deepali Koppad, Preethi Kumar, Neha A Kantikar, Surabhi Ramesh

Abstract: In recent years, advancements in deep learning techniques have considerably enhanced the efficiency and accuracy of medical diagnostics. In this work, a novel approach using multi-task learning (MTL) for the simultaneous classification of lung sounds and lung diseases is proposed. Our proposed model leverages MTL with four different deep learning models, namely 2D CNN, ResNet50, MobileNet and DenseNet, to extract relevant features from the lung sound recordings. The ICBHI 2017 Respiratory Sound Database was employed in the current study. The MTL for MobileNet model performed better than the other models considered, with an accuracy of 74% for lung sound analysis and 91% for lung disease classification. Results of the experimentation demonstrate the efficacy of our approach in classifying both lung sounds and lung diseases concurrently. In this study, using the demographic data of the patients from the database, risk level computation for Chronic Obstructive Pulmonary Disease is also carried out. For this computation, three machine learning algorithms, namely Logistic Regression, SVM and Random Forest classifiers, were employed. Among these ML algorithms, the Random Forest classifier had the highest accuracy of 92%. This work helps in considerably reducing the physician's burden of not just diagnosing the pathology but also effectively communicating to the patient about the possible causes or outcomes.

Submitted 5 April, 2024; originally announced April 2024.

arXiv:2403.01186 [pdf]

Evault for legal records

Authors: Jeba N, Anas S, Anuragav S, Abhishek R, Sachin K

Abstract: This work presents an innovative solution for addressing the challenges in the legal records management system through a blockchain-based eVault platform. Our objective is to create a secure, transparent, and accessible ecosystem that caters to the needs of all stakeholders, including lawyers, judges, clients, and registrars. First and foremost, our solution is built on a robust blockchain platform such as Ethereum, harnessing the power of smart contracts to manage access, permissions, and transactions effectively. This ensures the utmost security and transparency in every interaction within the system. To make our eVault system user-friendly, we've developed intuitive interfaces for all stakeholders. Lawyers, judges, clients, and even registrars can effortlessly upload and retrieve legal documents, track changes, and share information within the platform. But that's not all; we've gone a step further by incorporating a document creation and saving feature within our app and website. This feature allows users to generate and securely store legal documents, streamlining the entire documentation process.

Submitted 8 March, 2024; v1 submitted 2 March, 2024; originally announced March 2024.

Comments: Blockchain, evault, legal records

arXiv:2312.07220 [pdf, other]

Performance Characterization of Containerized DNN Training and Inference on Edge Accelerators

Authors: Prashanthi S. K., Vinayaka Hegde, Keerthana Patchava, Ankita Das, Yogesh Simmhan

Abstract: Edge devices have typically been used for DNN inferencing. The increase in the compute power of accelerated edges is leading to their use in DNN training also. As privacy becomes a concern on multi-tenant edge devices, Docker containers provide a lightweight virtualization mechanism to sandbox models. But their overheads for edge devices are not yet explored. In this work, we study the impact of containerized DNN inference and training workloads on an NVIDIA AGX Orin edge device and contrast it against bare metal execution on running time, CPU, GPU and memory utilization, and energy consumption. Our analysis shows that there are negligible containerization overheads for individually running DNN training and inference workloads.

Submitted 18 July, 2024; v1 submitted 12 December, 2023; originally announced December 2023.

Comments: Updated results of short paper published in HiPC 2023

arXiv:2312.01364 [pdf, other]

Tradeoff of age-of-information and power under reliability constraint for short-packet communication with block-length adaptation

Authors: Sudarsanan A. K., Vineeth B. S., Chandra R. Murthy

Abstract: In applications such as remote estimation and monitoring, update packets are transmitted by power-constrained devices using short-packet codes over wireless networks. Therefore, networks need to be end-to-end optimized using information freshness metrics such as age of information under transmit power and reliability constraints to ensure support for such applications. For short-packet coding, modelling and understanding the effect of block codeword length on transmit power and other performance metrics is important. To understand the above optimization for short-packet coding, we consider the optimal tradeoff problem between age of information and transmit power under reliability constraints for a short-packet point-to-point communication model with an exogenous packet generation process. In contrast to prior work, we consider scheduling policies that can adapt the block-length or transmission time of short-packet codes in order to achieve the optimal tradeoff. We characterize the tradeoff using a semi-Markov decision process formulation. We also obtain analytical upper bounds as well as numerical, analytical, and asymptotic lower bounds on the optimal tradeoff. We show that in certain regimes, such as high reliability and high packet generation rate, non-adaptive scheduling policies (fixed transmission time policies) are close-to-optimal. Furthermore, in a high-power or in a low-power regime, non-adaptive as well as state-independent randomized scheduling policies are order-optimal.
These results are corroborated by numerical and simulation experiments. The tradeoff is then characterized for a wireless point-to-point channel with block fading as well as for other packet generation models (including an age-dependent packet generation model).

Submitted 3 December, 2023; originally announced December 2023.

arXiv:2310.19760 [pdf]

Epidemic outbreak prediction using machine learning models

Authors: Akshara Pramod, JS Abhishek, Suganthi K

Abstract: In today's world, the risk of emerging and re-emerging epidemics has increased. Recent advances in healthcare technology have made it possible to predict an epidemic outbreak in a region. Early prediction of an epidemic outbreak greatly helps the authorities to be prepared with the necessary medications and logistics required to keep things in control. In this article, we try to predict epidemic outbreaks (influenza, hepatitis and malaria) for the state of New York, USA using machine and deep learning algorithms, and a portal has been created for the same which can alert the authorities and health care organizations of the region in case of an outbreak. The algorithm takes historical data to predict the possible number of cases for 5 weeks into the future. Non-clinical factors like Google search trends, social media data and weather data have also been used to predict the probability of an outbreak.

Submitted 30 October, 2023; originally announced October 2023.

Comments: 16 pages, 5 tables, 4 figures

arXiv:2310.14654 [pdf, ps, other]

SPRING-INX: A Multilingual Indian Language Speech Corpus by SPRING Lab, IIT Madras

Authors: Nithya R, Malavika S, Jordan F, Arjun Gangwar, Metilda N J, S Umesh, Rithik Sarab, Akhilesh Kumar Dubey, Govind Divakaran, Samudra Vijaya K, Suryakanth V Gangashetty

Abstract: India is home to a multitude of languages, of which 22 are recognised by the Indian Constitution as official. Building speech-based applications for the Indian population is a difficult problem owing to limited data and the number of languages and accents to accommodate. To encourage the language technology community to build speech-based applications in Indian languages, we are open sourcing SPRING-INX data, which has about 2000 hours of legally sourced and manually transcribed speech data for ASR system building in Assamese, Bengali, Gujarati, Hindi, Kannada, Malayalam, Marathi, Odia, Punjabi and Tamil. This endeavor is by SPRING Lab, Indian Institute of Technology Madras and is a part of the National Language Translation Mission (NLTM), funded by the Indian Ministry of Electronics and Information Technology (MeitY), Government of India. We describe the data collection and data cleaning process along with the data statistics in this paper.

Submitted 24 October, 2023; v1 submitted 23 October, 2023; originally announced October 2023.

Comments: 3 pages, About SPRING-INX Data

arXiv:2309.02572 [pdf]

Experience Capture in Shipbuilding through Computer Applications and Neural Networks

Authors: Sankaramangalam Ulhas Sangeet, Sivaprasad K, Yashwant R. Kamath

Abstract: It has always been a severe loss for any establishment when an experienced hand retires or moves to another firm. The specific details of what his job/position entails will always make the work more efficient. To curtail such losses, it is possible to implement a system that takes input from a new employee regarding the challenges he/she is facing and matches it to a previous occurrence where someone else held his/her chair. This system could be made possible with input through the ages from the array of individuals who managed that particular job and processing this data through a neural network that recognizes the pattern. The paper is based on data collected from traditional wooden dhow builders and some of the modern day unconventional shipyards. Since the requirements for successful implementation in such scenarios seem too steep at the moment, an alternate approach has been suggested by implementation through the design processes across multiple shipyards. The process entails the traditional value passed down through generations regarding a particular profession and analysis has been done regarding how this knowledge/experience can be captured and preserved for future generations to work upon. A series of tools including SharePoint, MATLAB, and some similar software working in tandem can be used for the design of the same. This research will provide valuable insight as to how information sharing can be applied through generations for effective application of production capabilities.

Submitted 5 September, 2023; originally announced September 2023.

Comments: Case study on knowledge transfer among employees with varying levels of experience in the field of shipbuilding

Journal ref: International Conference on Computer Applications in Shipbuilding 2017, 26-28 September 2017, Singapore

arXiv:2308.05125 [pdf, other]

Two Novel Approaches to Detect Community: A Case Study of Omicron Lineage Variants PPI Network

Authors: Mamata Das, Selvakumar K., P. J. A. Alphonse

Abstract: The capacity to identify and analyze protein-protein interactions, along with their internal modular organization, plays a crucial role in comprehending the intricate mechanisms underlying biological processes at the molecular level. We can learn a lot about the structure and dynamics of these interactions by using network analysis. We can improve our understanding of the biological roots of disease pathogenesis by recognizing network communities. This knowledge, in turn, holds significant potential for driving advancements in drug discovery and facilitating personalized medicine approaches for disease treatment. In this study, we aimed to uncover the communities within the variant B.1.1.529 (Omicron virus) using two proposed novel algorithms (ABCDE and ALCDE) and four widely recognized algorithms: Girvan-Newman, Louvain, Leiden, and Label Propagation. Each of these algorithms has established prominence in the field and offers unique perspectives on identifying communities within complex networks. We also compare the networks by their global properties, summary statistics, subgraph counts, and graphlets, and validate the results using modularity. By employing these approaches, we sought to gain deeper insights into the structural organization and interconnections present within the Omicron virus network.

Submitted 8 August, 2023; originally announced August 2023.

Comments: 23 pages, 11 figures
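
The abstract above mentions validating detected communities by modularity. As a rough illustration only, the sketch below computes Newman's modularity Q for an undirected, unweighted graph and a given partition, using Q = sum over communities c of [L_c / m - (d_c / 2m)^2]. The function name `modularity` and the toy graph (two triangles joined by one edge) are assumptions made for this example; they are not taken from the paper, and the ABCDE and ALCDE algorithms themselves are not shown.

```python
from collections import defaultdict


def modularity(edges, community_of):
    """Newman modularity Q of a partition of an undirected, unweighted graph.

    edges: list of (u, v) pairs, each undirected edge listed once.
    community_of: dict mapping node -> community label.
    Q = sum_c [ L_c / m - (d_c / (2m))^2 ], where m is the number of edges,
    L_c the number of edges inside community c, and d_c the total degree
    of the nodes in c.
    """
    m = len(edges)
    if m == 0:
        raise ValueError("graph has no edges")
    internal = defaultdict(int)  # L_c: edges with both endpoints in c
    degree = defaultdict(int)    # d_c: summed degree of nodes in c
    for u, v in edges:
        cu, cv = community_of[u], community_of[v]
        degree[cu] += 1
        degree[cv] += 1
        if cu == cv:
            internal[cu] += 1
    return sum(internal[c] / m - (degree[c] / (2 * m)) ** 2 for c in degree)


# Hypothetical toy network: two triangles joined by a single edge.
toy_edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
toy_partition = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
q = modularity(toy_edges, toy_partition)
```

For this toy partition Q works out to 5/14, while placing every node in a single community gives Q = 0, which is why a higher Q is usually read as a better-separated partition.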

arXiv:2308.04697 [pdf, other]

An Analytical Study of Covid-19 Dataset using Graph-Based Clustering Algorithms

Authors: Mamata Das, P. J. A. Alphonse, Selvakumar K

Abstract: Corona Virus Disease, abbreviated as COVID-19, is a novel disease that was first identified in Wuhan, China in December 2019, and this deadly disease has now spread all over the world. According to the World Health Organization (WHO), a total of 3,124,905 people died from 2019 to April 2021. In this context, many methods, AI-based techniques, and machine learning algorithms have been researched and are being used to save people from this pandemic. The SARS-CoV and the 2019-nCoV, SARS-CoV-2 virus invade our bodies, causing some differences in the structure of cell proteins. Protein-protein interaction (PPI) is an essential process in our cells and plays a very important role in the development of medicines and gives ideas about the disease. In this study, we performed clustering on PPI networks generated from 92 genes of the COVID-19 dataset. We have used three graph-based clustering algorithms to give intuition to the analysis of clusters.

Submitted 9 August, 2023; originally announced August 2023.

Comments: 9 pages, 28 figures, Fifth International Conference on Smart Computing and Informatics (SCI 2021)

Journal ref: Smart Intelligent Computing and Applications, Volume 1, SCI 2021

arXiv:2308.04037 [pdf]

A Comparative Study on TF-IDF feature Weighting Method and its Analysis using Unstructured Dataset

Authors: Mamata Das, Selvakumar K., P. J. A. Alphonse

Abstract: Text classification is the process of categorizing text into relevant categories, and its algorithms are at the core of many Natural Language Processing (NLP) applications. Term Frequency-Inverse Document Frequency (TF-IDF) and NLP are the most highly used information retrieval methods in text classification. We have investigated and analyzed the feature weighting method for text classification on unstructured data. The proposed model considered two features, N-Grams and TF-IDF, on the IMDB movie reviews and Amazon Alexa reviews datasets for sentiment analysis. We then used state-of-the-art classifiers to validate the method, i.e., Support Vector Machine (SVM), Logistic Regression, Multinomial Naive Bayes (Multinomial NB), Random Forest, Decision Tree, and k-nearest neighbors (KNN). Of the two feature extraction methods, TF-IDF features gave a significant improvement over N-Gram features. TF-IDF achieved the maximum accuracy (93.81%), precision (94.20%), recall (93.81%), and F1-score (91.99%) with the Random Forest classifier.

Submitted 8 August, 2023; originally announced August 2023.

Comments: 10 pages, 3 figures, COLINS-2021, 5th International Conference on Computational Linguistics and Intelligent Systems, April 22-23, 2021, Kharkiv, Ukraine
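
The abstract above compares TF-IDF and N-gram features. The following is a minimal sketch of both, assuming one common TF-IDF variant (relative term frequency times the natural log of N / document frequency); the paper may use a different weighting or smoothing. The function names `tf_idf` and `ngrams` and the two toy reviews are hypothetical and exist only for illustration.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {term: weight} dict per tokenised document.

    tf(t, d) = count of t in d / number of tokens in d
    idf(t)   = ln(N / df(t)), with N documents and df(t) documents containing t
    (an assumed, unsmoothed variant; libraries often add smoothing terms).
    """
    n_docs = len(documents)
    doc_freq = Counter()
    for tokens in documents:
        doc_freq.update(set(tokens))  # count each term once per document
    weights = []
    for tokens in documents:
        counts = Counter(tokens)
        total = len(tokens)
        weights.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


def ngrams(tokens, n):
    """Return the contiguous word n-grams of a token list as strings."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


# Made-up toy reviews, not data from the paper.
reviews = [
    "great movie great cast".split(),
    "boring movie".split(),
]
weights = tf_idf(reviews)
bigrams = ngrams(reviews[0], 2)
```

In this toy corpus "movie" occurs in both reviews, so its weight is zero, while "great" and "boring" keep positive weights; this down-weighting of ubiquitous terms is the property that plain N-gram counts lack.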

arXiv:2306.05680 [pdf, other]

doi 10.1109/EMBC40787.2023.10340389

Emotion Detection from EEG using Transfer Learning

Authors: Sidharth Sidharth, Ashish Abraham Samuel, Ranjana H, Jerrin Thomas Panachakel, Sana Parveen K

Abstract: The detection of emotions using an Electroencephalogram (EEG) is a crucial area in brain-computer interfaces and has valuable applications in fields such as rehabilitation and medicine. In this study, we employed transfer learning to overcome the challenge of limited data availability in EEG-based emotion detection. The base model used in this study was ResNet50. Additionally, we employed a novel feature combination in EEG-based emotion detection. The input to the model was in the form of an image matrix, which comprised Mean Phase Coherence (MPC) and Magnitude Squared Coherence (MSC) in the upper-triangular and lower-triangular matrices, respectively. We further improved the technique by incorporating features obtained from the Differential Entropy (DE) into the diagonal, which previously held little to no useful information for classifying emotions. The dataset used in this study, SEED EEG (62 channel EEG), comprises three classes (Positive, Neutral, and Negative). We calculated both subject-independent and subject-dependent accuracy. The subject-dependent accuracy was obtained using a 10-fold cross-validation method and was 93.1%, while the subject-independent classification was performed by employing the leave-one-subject-out (LOSO) strategy. The accuracy obtained in subject-independent classification was 71.6%. Both of these accuracies are at least twice the chance accuracy of classifying 3 classes. The study found the use of MSC and MPC in EEG-based emotion detection promising for emotion classification. The future scope of this work includes the use of data augmentation techniques, enhanced classifiers, and better features for emotion classification.

Submitted 9 June, 2023; originally announced June 2023.

Comments: Preprint of the manuscript accepted for presentation in 45th Annual International Conference of the IEEE Engineering in Medicine and Biology Society. DOI will be updated soon

Journal ref: 2023 45th Annual International Conference of the IEEE Engineering in Medicine & Biology Society (EMBC)
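
The abstract above describes an image matrix with MPC in the upper triangle, MSC in the lower triangle, and DE on the diagonal. The sketch below shows one way such a matrix could be assembled. It assumes one DE value per channel and precomputed symmetric MPC and MSC matrices; the function name `build_feature_image` and the 3-channel toy values are hypothetical (SEED has 62 channels), and the feature extraction itself is not shown.

```python
def build_feature_image(mpc, msc, de):
    """Combine three EEG features into one square matrix.

    mpc, msc: n x n channel-pair matrices (lists of lists).
    de: length-n list with one differential-entropy value per channel
        (an assumption; the paper may arrange DE differently).
    Upper triangle <- MPC, lower triangle <- MSC, diagonal <- DE.
    """
    n = len(de)
    if len(mpc) != n or len(msc) != n:
        raise ValueError("mpc, msc and de must describe the same channels")
    image = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i < j:
                image[i][j] = mpc[i][j]
            elif i > j:
                image[i][j] = msc[i][j]
            else:
                image[i][j] = de[i]
    return image


# Toy 3-channel example with made-up values.
mpc = [[1.0, 0.2, 0.3], [0.2, 1.0, 0.4], [0.3, 0.4, 1.0]]
msc = [[1.0, 0.5, 0.6], [0.5, 1.0, 0.7], [0.6, 0.7, 1.0]]
de = [2.1, 2.2, 2.3]
image = build_feature_image(mpc, msc, de)

# Chance level for three classes, assuming they are balanced.
chance = 1 / 3
```

Because MPC and MSC are symmetric, each needs only one triangle, so a single n x n image can carry all three features. With three balanced classes the chance level is about 33.3%, so the reported 71.6% and 93.1% both exceed twice that level (about 66.7%).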

arXiv:2305.17791 [pdf, other]

LowDINO -- A Low Parameter Self Supervised Learning Model

Authors: Sai Krishna Prathapaneni, Shvejan Shashank, Srikar Reddy K

Abstract: This research aims to explore the possibility of designing a neural network architecture that allows for small networks to adopt the properties of huge networks, which have shown success in self-supervised learning (SSL), for all the downstream tasks like image classification, segmentation, etc. Previous studies have shown that using convolutional neural networks (ConvNets) can provide inherent inductive bias, which is crucial for learning representations in deep learning models. To reduce the number of parameters, attention mechanisms are utilized through the usage of MobileViT blocks, resulting in a model with fewer than 5 million parameters. The model is trained using self-distillation with a momentum encoder, and a student-teacher architecture is also employed, where the teacher weights use vision transformers (ViTs) from recent SOTA SSL models. The model is trained on the ImageNet1k dataset. This research provides an approach for designing smaller, more efficient neural network architectures that can perform SSL tasks comparable to heavy models.

Submitted 28 May, 2023; originally announced May 2023.

arXiv:2305.12741 [pdf, other]

Coswara: A respiratory sounds and symptoms dataset for remote screening of SARS-CoV-2 infection

Authors: Debarpan Bhattacharya, Neeraj Kumar Sharma, Debottam Dutta, Srikanth Raj Chetupalli, Pravin Mote, Sriram Ganapathy, Chandrakiran C, Sahiti Nori, Suhail K K, Sadhana Gonuguntla, Murali Alagesan

Abstract: This paper presents the Coswara dataset, a dataset containing a diverse set of respiratory sounds and rich metadata, recorded between April 2020 and February 2022 from 2635 individuals (1819 SARS-CoV-2 negative, 674 positive, and 142 recovered subjects). The respiratory sounds contained nine sound categories associated with variants of breathing, cough and speech. The rich metadata contained demographic information associated with age, gender and geographic location, as well as the health information relating to the symptoms, pre-existing respiratory ailments, comorbidity and SARS-CoV-2 test status. Our study is the first of its kind to manually annotate the audio quality of the entire dataset (amounting to 65 hours) through manual listening. The paper summarizes the data collection procedure, demographic, symptoms and audio data information. A COVID-19 classifier based on a bi-directional long short-term memory (BLSTM) architecture is trained and evaluated on the different population sub-groups contained in the dataset to understand the bias/fairness of the model. This enabled the analysis of the impact of gender, geographic location, date of recording, and language proficiency on the COVID-19 detection performance.

Submitted 22 May, 2023; originally announced May 2023.

Comments: Accepted for publication in Nature Scientific Data

arXiv:2305.11063 [pdf]

Medical Data Asset Management and an Approach for Disease Prediction using Blockchain and Machine Learning

Authors: Shruthi K, Poornima A. S

Abstract: In present-day healthcare management, clinical health records take the form of electronic medical record (EHR/EMR) systems. These systems store patients' clinical histories in digital form. However, obtaining a patient's clinical data through these records in an efficient and timely way has proven difficult. This inability persistently keeps healthcare management from obtaining data and leads to underuse of the data obtained, unmanageable privacy controls, and poor data asset security. In this paper, we present an efficient and secure clinical data asset management system using Blockchain to resolve these issues. Blockchain technology facilitates access to all such records by keeping a block for each patient. This paper proposes an architecture using an off-chain solution that will enable doctors and patients to access records securely. Blockchain makes clinical records immutable and encrypts them for data integrity. Users can view their health records, but only patients own the private key and can share it with those they choose. Smart contracts also help data owners manage access to their data in a permissioned way. The final outcome will be a web and mobile interface to access, identify, and ensure high-security data easily. In this work, we provide solutions to the issues associated with healthcare data and its management using AI and Blockchain. Extracting only the essential information from the data is possible with the use of AI. This is done using arranged estimations. Once this data is handled, the next issue is information sharing and its reliability.

Submitted 27 April, 2023; originally announced May 2023.

Comments: Journal

arXiv:2303.13974

Mixed-Type Wafer Classification For Low Memory Devices Using Knowledge Distillation

Authors: Nitish Shukla, Anurima Dey, Srivatsan K

Abstract: Manufacturing wafers is an intricate task involving thousands of steps. Defect Pattern Recognition (DPR) of wafer maps is crucial for determining the root cause of production defects, which may further provide insight for yield improvement in wafer foundry. During manufacturing, various defects may appear standalone in the wafer or may appear as different combinations. Identifying multiple defects in a wafer is generally harder compared to identifying a single defect. Recently, deep learning methods have gained significant traction in mixed-type DPR. However, the complexity of defects requires complex and large models making them very difficult to operate on low-memory embedded devices typically used in fabrication labs. Another common issue is the unavailability of labeled data to train complex networks. In this work, we propose an unsupervised training routine to distill the knowledge of complex pre-trained models to lightweight deployment-ready models. We empirically show that this type of training compresses the model without sacrificing accuracy despite being up to 10 times smaller than the teacher model. The compressed model also manages to outperform contemporary state-of-the-art models.

Submitted 18 October, 2023; v1 submitted 24 March, 2023; originally announced March 2023.

Comments: Study is not relevant

arXiv:2207.12022 [pdf, other]

doi 10.1109/ACCESS.2023.3234625

Peer-to-Peer Sharing of Energy Storage Systems under Net Metering and Time-of-Use Pricing

Authors: K. Victor Sam Moses Babu, Satya Surya Vinay K, Pratyush Chakraborty

Abstract: The sharing economy has become a socio-economic trend in the transportation and housing sectors. It develops business models leveraging underutilized resources. Like those sectors, the power grid is also becoming smarter with many flexible resources, and researchers are investigating the impact of sharing resources here as well that can help to reduce cost and extract value. In this work, we investigate sharing of energy storage devices among individual households in a cooperative fashion. Coalitional game theory is used to model the scenario where the utility company imposes a time-of-use (ToU) price and a net metering billing mechanism. The resulting game has a non-empty core and we can develop a cost allocation mechanism with an easy-to-compute analytical formula. Allocation is fair and cost effective for every household. We design the price for the peer-to-peer (P2P) network and an algorithm for sharing that keeps the grand coalition always stable. Thus sharing electricity of storage devices among consumers can be effective in this set-up. Our mechanism is implemented in a community of 80 households in Texas using real data of demand and solar irradiance and the results show significant cost savings for our method.

Submitted 1 October, 2022; v1 submitted 25 July, 2022; originally announced July 2022.

arXiv:2206.12309 [pdf, other]

Analyzing the impact of SARS-CoV-2 variants on respiratory sound signals

Authors: Debarpan Bhattacharya, Debottam Dutta, Neeraj Kumar Sharma, Srikanth Raj Chetupalli, Pravin Mote, Sriram Ganapathy, Chandrakiran C, Sahiti Nori, Suhail K K, Sadhana Gonuguntla, Murali Alagesan

Abstract: The COVID-19 outbreak resulted in multiple waves of infections that have been associated with different SARS-CoV-2 variants. Studies have reported differential impact of the variants on respiratory health of patients. We explore whether acoustic signals, collected from COVID-19 subjects, show computationally distinguishable acoustic patterns suggesting a possibility to predict the underlying virus variant. We analyze the Coswara dataset which is collected from three subject pools, namely, i) healthy, ii) COVID-19 subjects recorded during the delta variant dominant period, and iii) data from COVID-19 subjects recorded during the omicron surge. Our findings suggest that multiple sound categories, such as cough, breathing, and speech, indicate significant acoustic feature differences when comparing COVID-19 subjects with omicron and delta variants. The classification areas-under-the-curve are significantly above chance for differentiating subjects infected by omicron from those infected by delta. Using a score fusion from multiple sound categories, we obtained an area-under-the-curve of 89% and 52.4% sensitivity at 95% specificity. Additionally, a hierarchical three class approach was used to classify the acoustic data into healthy and COVID-19 positive, and further COVID-19 subjects into delta and omicron variants, providing a high level of 3-class classification accuracy. These results suggest new ways for designing sound based COVID-19 diagnosis approaches.

Submitted 24 June, 2022; originally announced June 2022.

Journal ref: Interspeech, 2022

arXiv:2206.05053 [pdf, other]

Coswara: A website application enabling COVID-19 screening by analysing respiratory sound samples and health symptoms

Authors: Debarpan Bhattacharya, Debottam Dutta, Neeraj Kumar Sharma, Srikanth Raj Chetupalli, Pravin Mote, Sriram Ganapathy, Chandrakiran C, Sahiti Nori, Suhail K K, Sadhana Gonuguntla, Murali Alagesan

Abstract: The COVID-19 pandemic has accelerated research on design of alternative, quick and effective COVID-19 diagnosis approaches. In this paper, we describe the Coswara tool, a website application designed to enable COVID-19 detection by analysing respiratory sound samples and health symptoms. A user using this service can log into a website using any device connected to the internet, provide their current health symptom information and record a few sound samples corresponding to breathing, cough, and speech. Within a minute of analysis of this information on a cloud server the website tool will output a COVID-19 probability score to the user. As the COVID-19 pandemic continues to demand massive and scalable population level testing, we hypothesize that the proposed tool provides a potential solution towards this.

Submitted 9 June, 2022; originally announced June 2022.

Journal ref: Interspeech, 2022

arXiv:2111.14707 [pdf]

Real-time Attention Span Tracking in Online Education

Authors: Rahul RK, Shanthakumar S, Vykunth P, Sairamnath K

Abstract: Over the last decade, e-learning has revolutionized how students learn by providing them access to quality education whenever and wherever they want. However, students often get distracted because of various reasons, which affect the learning capacity to a great extent. Many researchers have been trying to improve the quality of online education, but we need a holistic approach to address this issue. This paper intends to provide a mechanism that uses the camera feed and microphone input to monitor the real-time attention level of students during online classes. We explore various image processing techniques and machine learning algorithms throughout this study. We propose a system that uses five distinct non-verbal features to calculate the attention score of the student during computer based tasks and generate real-time feedback for both students and the organization. We can use the generated feedback as a heuristic value to analyze the overall performance of students as well as the teaching standards of the lecturers.

Submitted 29 November, 2021; originally announced November 2021.

arXiv:2108.10034 [pdf, other]

Collation of Feasible Solutions for Domain Based Problems: An Analysis of Sentiments Based on Codeathon Activity

Authors: Rajeshwari K, Preetha S, Anitha C, Lakshmi Shree K, Pronoy Roy

Abstract: Codeathon activity is a practical approach for enduring the principles of Software Engineering and Object Oriented Modelling. The solution to a real-world domain problem was accomplished through team work. Analysing the problem and designing a feasible solution through a one-day activity was achieved through virtual connection. There are three different sections in a semester; 13 teams were formed and assigned one problem statement. Each team was expected to prototype a solution, which was further used to build one feasible solution. The feedback from students showed different sentiments associated with the day-long activity. Vivid emotions and expressions of students were analysed.

Submitted 23 August, 2021; originally announced August 2021.

Comments: 10 pages, 15 figures, 1 table

arXiv:2107.05840 [pdf, other]

doi 10.1007/978-3-030-87193-2_16

NucMM Dataset: 3D Neuronal Nuclei Instance Segmentation at Sub-Cubic Millimeter Scale

Authors: Zudi Lin, Donglai Wei, Mariela D. Petkova, Yuelong Wu, Zergham Ahmed, Krishna Swaroop K, Silin Zou, Nils Wendt, Jonathan Boulanger-Weill, Xueying Wang, Nagaraju Dhanyasi, Ignacio Arganda-Carreras, Florian Engert, Jeff Lichtman, Hanspeter Pfister

Abstract: Segmenting 3D cell nuclei from microscopy image volumes is critical for biological and clinical analysis, enabling the study of cellular expression patterns and cell lineages. However, current datasets for neuronal nuclei usually contain volumes smaller than $10^{\text{-}3}\ mm^3$ with fewer than 500 instances per volume, unable to reveal the complexity in large brain regions and restrict the investigation of neuronal structures. In this paper, we have pushed the task forward to the sub-cubic millimeter scale and curated the NucMM dataset with two fully annotated volumes: one $0.1\ mm^3$ electron microscopy (EM) volume containing nearly the entire zebrafish brain with around 170,000 nuclei; and one $0.25\ mm^3$ micro-CT (uCT) volume containing part of a mouse visual cortex with about 7,000 nuclei. With two imaging modalities and significantly increased volume size and instance numbers, we discover a great diversity of neuronal nuclei in appearance and density, introducing new challenges to the field. We also perform a statistical analysis to illustrate those challenges quantitatively. To tackle the challenges, we propose a novel hybrid-representation learning model that combines the merits of foreground mask, contour map, and signed distance transform to produce high-quality 3D masks. The benchmark comparisons on the NucMM dataset show that our proposed method significantly outperforms state-of-the-art nuclei segmentation approaches. Code and data are available at https://connectomics-bazaar.github.io/proj/nucMM/index.html.

Submitted 7 December, 2021; v1 submitted 13 July, 2021; originally announced July 2021.

Comments: MICCAI 2021. Fix typos and update citations

arXiv:2106.12665 [pdf, other]

Reimagining GNN Explanations with ideas from Tabular Data

Authors: Anjali Singh, Shamanth R Nayak K, Balaji Ganesan

Abstract: Explainability techniques for Graph Neural Networks still have a long way to go compared to explanations available for both neural and decision tree-based models trained on tabular data. Using a task that straddles both graphs and tabular data, namely Entity Matching, we comment on key aspects of explainability that are missing in GNN model explanations.

Submitted 23 June, 2021; originally announced June 2021.

Comments: 4 pages, 8 figures, XAI Workshop at ICML 2021

arXiv:2104.07326 [pdf, other]

EnvGAN: Adversarial Synthesis of Environmental Sounds for Data Augmentation

Authors: Aswathy Madhu, Suresh K

Abstract: The research in Environmental Sound Classification (ESC) has been progressively growing with the emergence of deep learning algorithms. However, data scarcity poses a major hurdle for any huge advance in this domain. Data augmentation offers an excellent solution to this problem. While Generative Adversarial Networks (GANs) have been successful in generating synthetic speech and sounds of musical instruments, they have hardly been applied to the generation of environmental sounds. This paper presents EnvGAN, the first ever application of GANs for the adversarial generation of environmental sounds. Our experiments on three standard ESC datasets illustrate that the EnvGAN can synthesize audio similar to the ones in the datasets. The suggested method of augmentation outshines most of the futuristic techniques for audio augmentation.

Submitted 15 April, 2021; originally announced April 2021.

Comments: Submitted to IEEE Transactions on Audio, Speech and Language Processing

arXiv:2103.04032 [pdf, other]

CAM-GAN: Continual Adaptation Modules for Generative Adversarial Networks

Authors: Sakshi Varshney, Vinay Kumar Verma, Srijith P K, Lawrence Carin, Piyush Rai

Abstract: We present a continual learning approach for generative adversarial networks (GANs), by designing and leveraging parameter-efficient feature map transformations. Our approach is based on learning a set of global and task-specific parameters. The global parameters are fixed across tasks whereas the task-specific parameters act as local adapters for each task, and help in efficiently obtaining task-specific feature maps. Moreover, we propose an element-wise addition of residual bias in the transformed feature space, which further helps stabilize GAN training in such settings. Our approach also leverages task similarity information based on the Fisher information matrix. Leveraging this knowledge from previous tasks significantly improves the model performance. In addition, the similarity measure also helps reduce the parameter growth in continual adaptation and helps to learn a compact model. In contrast to the recent approaches for continually-learned GANs, the proposed approach provides a memory-efficient way to perform effective continual data generation. Through extensive experiments on challenging and diverse datasets, we show that the feature-map-transformation approach outperforms state-of-the-art methods for continually-learned GANs, with substantially fewer parameters. The proposed method generates high-quality samples that can also improve the generative-replay-based continual learning for discriminative tasks.

Submitted 30 July, 2021; v1 submitted 6 March, 2021; originally announced March 2021.

Comments: Under Submission

arXiv:2101.06459 [pdf, other]

Robustness to Augmentations as a Generalization metric

Authors: Sumukh Aithal K, Dhruva Kashyap, Natarajan Subramanyam

Abstract: Generalization is the ability of a model to predict on unseen domains and is a fundamental task in machine learning. Several generalization bounds, both theoretical and empirical, have been proposed, but they do not provide tight bounds. In this work, we propose a simple yet effective method to predict the generalization performance of a model by using the concept that models that are robust to augmentations are more generalizable than those which are not. We experiment with several augmentations and compositions of augmentations to check the generalization capacity of a model. We also provide a detailed motivation behind the proposed method. The proposed generalization metric is calculated based on the change in the output of the model after augmenting the input. The proposed method was the first runner-up solution for the NeurIPS competition on Predicting Generalization in Deep Learning.

Submitted 16 January, 2021; originally announced January 2021.

arXiv:2010.05690

COVID-19 Classification Using Staked Ensembles: A Comprehensive Analysis

Authors: Lalith Bharadwaj B, Rohit Boddeda, Sai Vardhan K, Madhu G

Abstract: The issue of COVID-19 is growing, with a massive mortality rate. This led to the WHO declaring it a pandemic. In this situation, it is crucial to perform efficient and fast diagnosis. The reverse transcription polymerase chain reaction (RT-PCR) test is conducted to detect the presence of SARS-CoV-2. This test is time-consuming, and chest CT (or chest X-ray) can be used instead for a fast and accurate diagnosis. Automated diagnosis is considered to be important as it reduces human effort and provides accurate and low-cost tests. The contributions of our research are three-fold. First, it aims to analyse the behaviour and performance of various vision models, ranging from Inception to NAS networks, with the appropriate fine-tuning procedure. Second, the behaviour of these models is visually analysed by plotting CAMs for individual networks and determining classification performance with AUCROC curves. Third, stacked ensemble techniques are applied to provide higher generalisation by combining the fine-tuned models; six ensemble neural networks are designed by combining the existing fine-tuned networks. Applying these stacked ensembles gives the models strong generalisation. The ensemble model designed by combining all the fine-tuned networks obtained a state-of-the-art accuracy score of 99.17%. The precision and recall for the COVID-19 class are 99.99% and 89.79% respectively, which reflects the robustness of the stacked ensembles.

Submitted 7 August, 2021; v1 submitted 7 October, 2020; originally announced October 2020.

Comments: This paper has serious technical concerns. The diagnostic model which was built is inaccurate and the results are flawed

arXiv:2010.03023 [pdf, other]

IS-CAM: Integrated Score-CAM for axiomatic-based explanations

Authors: Rakshit Naidu, Ankita Ghosh, Yash Maurya, Shamanth R Nayak K, Soumya Snigdha Kundu

Abstract: Convolutional Neural Networks have been known as black-box models as humans cannot interpret their inner functionalities. With an attempt to make CNNs more interpretable and trustworthy, we propose IS-CAM (Integrated Score-CAM), where we introduce the integration operation within the Score-CAM pipeline to achieve visually sharper attribution maps quantitatively. Our method is evaluated on 2000 randomly selected images from the ILSVRC 2012 Validation dataset, which proves the versatility of IS-CAM to account for different models and methods.

Submitted 6 October, 2020; originally announced October 2020.

Comments: 8 pages

arXiv:2009.02942 [pdf]

Detection of Colluded Black-hole and Grey-hole attacks in Cloud Computing

Authors: Divyasree I R, Selvamani K, Riasudheen H

Abstract: The availability of high-capacity networks, massive storage, hardware virtualization, utility computing, and service-oriented architecture leads to high accessibility of cloud computing. The extensive usage of cloud resources causes many security issues. Black-hole and Gray-hole attacks are notable attacks on cloud networks because they are easy to launch but difficult to detect. This research work proposes an efficient integrated detection method for individual and collusion attacks in cloud computing. In the individual attack detection phase, the forwarding ratio metric is used to differentiate malicious nodes from normal nodes. In the collusion attack detection phase, the malicious nodes manipulate the encounter records to escape the detection process. To overcome this, fake encounters are examined along with appearance frequency and the number of messages to expose abnormal patterns. The simulation results show that the proposed system detects these attacks with better accuracy.

Submitted 7 September, 2020; originally announced September 2020.

arXiv:2007.01787 [pdf, other]

Evaluating Uncertainty Estimation Methods on 3D Semantic Segmentation of Point Clouds

Authors: Swaroop Bhandary K, Nico Hochgeschwender, Paul Plöger, Frank Kirchner, Matias Valdenegro-Toro

Abstract: Deep learning models are extensively used in various safety critical applications. Hence these models along with being accurate need to be highly reliable. One way of achieving this is by quantifying uncertainty. Bayesian methods for UQ have been extensively studied for Deep Learning models applied on images but have been less explored for 3D modalities such as point clouds often used for Robots and Autonomous Systems. In this work, we evaluate three uncertainty quantification methods namely Deep Ensembles, MC-Dropout and MC-DropConnect on the DarkNet21Seg 3D semantic segmentation model and comprehensively analyze the impact of various parameters such as number of models in ensembles or forward passes, and drop probability values, on task performance and uncertainty estimate quality. We find that Deep Ensembles outperforms other methods in both performance and uncertainty metrics. Deep ensembles outperform other methods by a margin of 2.4% in terms of mIOU, 1.3% in terms of accuracy, while providing reliable uncertainty for decision making.

Submitted 3 July, 2020; originally announced July 2020.

Comments: 12 pages, 19 figures, ICML 2020 Workshop on Uncertainty and Robustness in Deep Learning

arXiv:2006.07677 [pdf, ps, other]

doi 10.1007/s00500-023-08752-2

Total Coloring for some classes of Cayley graphs

Authors: Prajnanaswaroopa S, Geetha J, Somasundaram K

Abstract: The Total coloring conjecture states that any simple graph G with maximum degree D can be totally colored with at most D+2 colors. In this paper, we have obtained the total chromatic number for some classes of Cayley graphs.

Submitted 4 July, 2020; v1 submitted 13 June, 2020; originally announced June 2020.

Comments: 11 pages

Report number: 27, 15609--15617 (2023) MSC Class: 05C15 ACM Class: G.2.2; G.2.1

Journal ref: Soft Computing, Springer (2023)

arXiv:2006.07580 [pdf, other]

Modeling Implicit Communities using Spatio-Temporal Point Processes from Geo-tagged Event Traces

Authors: Ankita Likhyani, Vinayak Gupta, Srijith P. K., Deepak P., Srikanta Bedathur

Abstract: The location check-ins of users through various location-based services such as Foursquare, Twitter, and Facebook Places, etc., generate large traces of geo-tagged events. These event-traces often manifest in hidden (possibly overlapping) communities of users with similar interests. Inferring these implicit communities is crucial for forming user profiles for improvements in recommendation and prediction tasks. Given only time-stamped geo-tagged traces of users, can we find out these implicit communities, and characteristics of the underlying influence network? Can we use this network to improve the next location prediction task? In this paper, we focus on the problem of community detection as well as capturing the underlying diffusion process and propose a model COLAB based on Spatio-temporal point processes in continuous time but discrete space of locations that simultaneously models the implicit communities of users based on their check-in activities, without making use of their social network connections. COLAB captures the semantic features of the location, user-to-user influence along with spatial and temporal preferences of users. To learn the latent community of users and model parameters, we propose an algorithm based on stochastic variational inference. To the best of our knowledge, this is the first attempt at jointly modeling the diffusion process with activity-driven implicit communities.
We demonstrate COLAB achieves up to 27% improvements in location prediction task over recent deep point-process based methods on geo-tagged event traces collected from Foursquare check-ins.

Submitted 13 June, 2020; originally announced June 2020.

Comments: 17 pages

arXiv:2006.03317 [pdf]

Securing IoT Applications using Blockchain: A Survey

Authors: Sreelakshmi K. K., Ashutosh Bhatia, Ankit Agrawal

Abstract: The Internet of Things (IoT) has become a guiding technology behind automation and smart computing. One of the major concerns with the IoT systems is the lack of privacy and security preserving schemes for controlling access and ensuring the security of the data. A majority of security issues arise because of the centralized architecture of IoT systems. Another concern is the lack of proper authentication and access control schemes to moderate access to information generated by the IoT devices. So the question that arises is how to ensure the identity of the equipment or the communicating node. The answer to secure operations in a trustless environment brings us to the decentralized solution of Blockchain. A lot of research has been going on in the area of convergence of IoT and Blockchain, and it has resulted in some remarkable progress in addressing some of the significant issues in the IoT arena. This work reviews the challenges and threats in the IoT environment and how integration with Blockchain can resolve some of them.

Submitted 5 June, 2020; originally announced June 2020.

arXiv:2004.11726 [pdf, other]

A Two-Stage Multiple Instance Learning Framework for the Detection of Breast Cancer in Mammograms

Authors: Sarath Chandra K, Arunava Chakravarty, Nirmalya Ghosh, Tandra Sarkar, Ramanathan Sethuraman, Debdoot Sheet

Abstract: Mammograms are commonly employed in the large scale screening of breast cancer which is primarily characterized by the presence of malignant masses. However, automated image-level detection of malignancy is a challenging task given the small size of the mass regions and difficulty in discriminating between malignant, benign mass and healthy dense fibro-glandular tissue. To address these issues, we explore a two-stage Multiple Instance Learning (MIL) framework. A Convolutional Neural Network (CNN) is trained in the first stage to extract local candidate patches in the mammograms that may contain either a benign or malignant mass. The second stage employs a MIL strategy for an image-level benign vs. malignant classification. A global image-level feature is computed as a weighted average of patch-level features learned using a CNN. Our method performed well on the task of localization of masses with an average Precision/Recall of 0.76/0.80 and achieved an average AUC of 0.91 on the image-level classification task using a five-fold cross-validation on the INbreast dataset. Restricting the MIL only to the candidate patches extracted in Stage 1 led to a significant improvement in classification performance in comparison to a dense extraction of patches from the entire mammogram.

Submitted 24 April, 2020; originally announced April 2020.

Comments: accepted in EMBC 2020, 4 pg+1 pg Supplementary

arXiv:2004.04812 [pdf, other]

Deep Learning based Frameworks for Handling Imbalance in DGA, Email, and URL Data Analysis

Authors: Simran K, Prathiksha Balakrishna, Vinayakumar Ravi, Soman KP

Abstract: Deep learning is a state-of-the-art method for many applications. The main issue is that most real-time data is highly imbalanced in nature. In order to avoid bias in training, a cost-sensitive approach can be used. In this paper, we propose cost-sensitive deep learning based frameworks, and the performance of the frameworks is evaluated on three different Cyber Security use cases: Domain Generation Algorithm (DGA), Electronic mail (Email), and Uniform Resource Locator (URL). Various experiments were performed using cost-insensitive as well as cost-sensitive methods, and parameters for both of these methods are set based on hyperparameter tuning. In all experiments, the cost-sensitive deep learning methods performed better than the cost-insensitive approaches. This is mainly because the cost-sensitive approach gives importance to the classes that have very few samples during training, which helps to learn all the classes in a more efficient manner.

Submitted 17 October, 2020; v1 submitted 30 March, 2020; originally announced April 2020.

Comments: 12 pages

arXiv:2004.00503 [pdf, other]

Deep Learning Approach for Enhanced Cyber Threat Indicators in Twitter Stream

Authors: Simran K, Prathiksha Balakrishna, Vinayakumar R, Soman KP

Abstract: In recent days, the amount of Cyber Security text data shared via social media resources, mainly Twitter, has increased. An accurate analysis of this data can help to develop a cyber threat situational awareness framework. This work proposes a deep learning based approach for tweet data analysis. To convert the tweets into numerical representations, various text representations are employed. These features are fed into a deep learning architecture for optimal feature extraction as well as classification. Various hyperparameter tuning approaches are used for identifying the optimal text representation method as well as optimal network parameters and network structures for deep learning models. For comparative analysis, the classical text representation method with a classical machine learning algorithm is employed. From the detailed analysis of experiments, we found that the deep learning architecture with advanced text representation methods performed better than the classical text representation and classical machine learning algorithms. The primary reason for this is that the advanced text representation methods have the capability to learn sequential properties which exist among the textual data, and deep learning architectures learn the optimal features along with decreasing the feature size.

Submitted 30 March, 2020; originally announced April 2020.

Comments: 11 pages

arXiv:2004.00502 [pdf, other]

Deep Learning Approach for Intelligent Named Entity Recognition of Cyber Security

Authors: Simran K, Sriram S, Vinayakumar R, Soman KP

Abstract: In recent years, the amount of Cyber Security data generated in the form of unstructured texts, for example, social media resources, blogs, articles, and so on has exceptionally increased. Named Entity Recognition (NER) is an initial step towards converting this unstructured data into structured data which can be used by a lot of applications. The existing methods on NER for Cyber Security data are based on rules and linguistic characteristics. A Deep Learning (DL) based approach embedded with Conditional Random Fields (CRFs) is proposed in this paper. Several DL architectures are evaluated to find the most optimal architecture. The combination of Bidirectional Gated Recurrent Unit (Bi-GRU), Convolutional Neural Network (CNN), and CRF performed better compared to various other DL frameworks on a publicly available benchmark dataset. This may be due to the reason that the bidirectional structures preserve the features related to the future and previous words in a sequence.

Submitted 30 March, 2020; originally announced April 2020.

Comments: 10 pages

arXiv:2001.10094 [pdf]

OMAP-L138 LCDK Development Kit

Authors: Bharath K P, Sylash K, Pravina K, Rajesh Kumar M

Abstract: Low-cost, low-power processors play a vital role in the field of Digital Signal Processing (DSP). The OMAP-L138 development kit offers low cost, low power consumption, ease of use, and speed, with a wide variety of applications including digital signal processing, image processing, and video processing. This paper presents a basic introduction to the OMAP-L138 processor and quick procedural steps for real-time and non-real-time implementations with a set of programs. The real-time experiments are audio-based: audio loopback, delay, and echo. The non-real-time experiments are the generation of a sine wave and low-pass and high-pass filters.

Submitted 13 January, 2020; originally announced January 2020.

arXiv:2001.07342 [pdf, other]

Transfer Learning using Neural Ordinary Differential Equations

Authors: Rajath S, Sumukh Aithal K, Natarajan Subramanyam

Abstract: A concept of using Neural Ordinary Differential Equations (NODE) for Transfer Learning has been introduced. In this paper we use EfficientNets to explore transfer learning on the CIFAR-10 dataset. We use NODE for fine-tuning our model. Using NODE for fine-tuning provides more stability during training and validation. These continuous depth blocks can also have a trade-off between numerical precision and speed. Using Neural ODEs for transfer learning has resulted in much more stable convergence of the loss function.

Submitted 20 January, 2020; originally announced January 2020.

Showing 1–50 of 78 results for author: K., S