-
NIRMAL Pooling: An Adaptive Max Pooling Approach with Non-linear Activation for Enhanced Image Classification
Authors:
Nirmal Gaud,
Krishna Kumar Jha,
Jhimli Adhikari,
Adhini Nasarin P S,
Joydeep Das,
Samarth S Deshpande,
Nitasha Barara,
Vaduguru Venkata Ramya,
Santu Saha,
Mehmet Tarik Baran,
Sarangi Venkateshwarlu,
Anusha M D,
Surej Mouli,
Preeti Katiyar,
Vipin Kumar Chaudhary
Abstract:
This paper presents NIRMAL Pooling, a novel pooling layer for Convolutional Neural Networks (CNNs) that integrates adaptive max pooling with a non-linear activation function for image classification tasks. The acronym NIRMAL stands for Non-linear Activation, Intermediate Aggregation, Reduction, Maximum, Adaptive, and Localized. By dynamically adjusting pooling parameters based on desired output dimensions and applying a Rectified Linear Unit (ReLU) activation post-pooling, NIRMAL Pooling improves robustness and feature expressiveness. We evaluated its performance against standard Max Pooling on three benchmark datasets: MNIST Digits, MNIST Fashion, and CIFAR-10. NIRMAL Pooling achieves test accuracies of 99.25% (vs. 99.12% for Max Pooling) on MNIST Digits, 91.59% (vs. 91.44%) on MNIST Fashion, and 70.49% (vs. 68.87%) on CIFAR-10, demonstrating consistent improvements, particularly on complex datasets. This work highlights the potential of NIRMAL Pooling to enhance CNN performance in diverse image recognition tasks, offering a flexible and reliable alternative to traditional pooling methods.
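The abstract describes NIRMAL Pooling as adaptive max pooling followed by a ReLU activation. A minimal pure-Python sketch of that idea on a single 2D feature map follows. The window-boundary rule (floor and ceiling of the index ratio) and the function name `adaptive_max_pool_relu` are assumptions for illustration, not the authors' implementation.

```python
def adaptive_max_pool_relu(feature_map, out_h, out_w):
    """Adaptive max pooling to (out_h, out_w), then ReLU on each pooled value.

    feature_map is a non-empty list of equal-length rows of numbers.
    """
    in_h, in_w = len(feature_map), len(feature_map[0])
    pooled = []
    for i in range(out_h):
        # Window rows are derived from the requested output size (assumed rule):
        # start = floor(i * in_h / out_h), end = ceil((i + 1) * in_h / out_h).
        r0 = (i * in_h) // out_h
        r1 = ((i + 1) * in_h + out_h - 1) // out_h
        row = []
        for j in range(out_w):
            c0 = (j * in_w) // out_w
            c1 = ((j + 1) * in_w + out_w - 1) // out_w
            window_max = max(
                feature_map[r][c] for r in range(r0, r1) for c in range(c0, c1)
            )
            # ReLU applied after pooling: negative maxima are clipped to zero.
            row.append(max(0.0, window_max))
        pooled.append(row)
    return pooled


example = [
    [-1.0, -2.0, 3.0, 4.0],
    [-3.0, -4.0, 5.0, 6.0],
    [-5.0, -6.0, -7.0, -8.0],
    [-9.0, -1.0, -2.0, -3.0],
]
print(adaptive_max_pool_relu(example, 2, 2))
```

In this sketch the only difference from plain adaptive max pooling is the final clipping step, which matters only when an entire window is negative.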
Submitted 13 August, 2025;
originally announced August 2025.
-
UGPL: Uncertainty-Guided Progressive Learning for Evidence-Based Classification in Computed Tomography
Authors:
Shravan Venkatraman,
Pavan Kumar S,
Rakesh Raj Madavan,
Chandrakala S
Abstract:
Accurate classification of computed tomography (CT) images is essential for diagnosis and treatment planning, but existing methods often struggle with the subtle and spatially diverse nature of pathological features. Current approaches typically process images uniformly, limiting their ability to detect localized abnormalities that require focused analysis. We introduce UGPL, an uncertainty-guided progressive learning framework that performs a global-to-local analysis by first identifying regions of diagnostic ambiguity and then conducting detailed examination of these critical areas. Our approach employs evidential deep learning to quantify predictive uncertainty, guiding the extraction of informative patches through a non-maximum suppression mechanism that maintains spatial diversity. This progressive refinement strategy, combined with an adaptive fusion mechanism, enables UGPL to integrate both contextual information and fine-grained details. Experiments across three CT datasets demonstrate that UGPL consistently outperforms state-of-the-art methods, achieving improvements of 3.29%, 2.46%, and 8.08% in accuracy for kidney abnormality, lung cancer, and COVID-19 detection, respectively. Our analysis shows that the uncertainty-guided component provides substantial benefits, with performance dramatically increasing when the full progressive learning pipeline is implemented. Our code is available at: https://github.com/shravan-18/UGPL
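The patch-selection step described above (uncertainty-guided extraction with non-maximum suppression to keep patches spatially diverse) can be sketched as a greedy selection over an uncertainty map. The function name, the Chebyshev-distance suppression rule, and the `min_distance` parameter are assumptions made for illustration and are not taken from the UGPL code.

```python
def select_patch_centres(uncertainty, num_patches, min_distance):
    """Greedily pick the most uncertain cells, suppressing close neighbours.

    uncertainty is a 2D list of scores; returns a list of (row, col) centres.
    """
    # Rank every cell by uncertainty, highest first.
    cells = sorted(
        ((value, r, c)
         for r, row in enumerate(uncertainty)
         for c, value in enumerate(row)),
        key=lambda item: item[0],
        reverse=True,
    )
    selected = []
    for _, r, c in cells:
        # Suppress a candidate that lies too close to an already chosen centre.
        far_enough = all(
            max(abs(r - sr), abs(c - sc)) >= min_distance for sr, sc in selected
        )
        if far_enough:
            selected.append((r, c))
            if len(selected) == num_patches:
                break
    return selected


uncertainty_map = [
    [0.1, 0.2, 0.1, 0.0],
    [0.2, 0.9, 0.8, 0.0],
    [0.1, 0.3, 0.1, 0.0],
    [0.0, 0.0, 0.0, 0.7],
]
print(select_patch_centres(uncertainty_map, num_patches=2, min_distance=2))
```

Here the second-highest cell is skipped because it sits next to the first pick, so the second patch comes from a different region of the map.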
Submitted 18 July, 2025;
originally announced July 2025.
-
Generating Dynamic Graph Algorithms for Multiple Backends for a Graph DSL
Authors:
Nibedita Behera,
Ashwina Kumar,
Atharva Chougule,
Mohammed Shan P S,
Rushabh Nirdosh Lalwani,
Rupesh Nasre
Abstract:
With the rapid growth of unstructured and semistructured data, parallelizing graph algorithms has become essential for efficiency. However, due to the inherent irregularity in computation, memory access patterns, and communication, graph algorithms are notoriously difficult to parallelize. To address this challenge, several libraries, frameworks, and domain-specific languages (DSLs) have been proposed to ease the parallel programming burden for domain experts. Existing frameworks partially or fully abstract away parallelism intricacies, provide intuitive scheduling mnemonics, and employ program analysis to identify data races and generate synchronization code. Despite these advances, most frameworks are limited in their abstractions and runtime optimizations, especially when dealing with static graphs. In contrast, many real-world graphs are inherently dynamic, with evolving structures over time through insertions, deletions, and modifications of vertices, edges, and attributes. Generating efficient and correctly synchronized code for such dynamic graph algorithms remains a significant challenge.
In this work, we introduce an abstraction scheme and runtime optimizations for the efficient processing of morph algorithms. Specifically, given an initial graph $G$ and a set of updates $\Delta G$ involving edge insertions and deletions, we express the dynamic processing logic through a DSL and automatically generate parallel code targeting multicore, distributed, and many-core environments. We demonstrate the effectiveness of our approach by applying the DSL-generated code to ten large graphs with diverse characteristics and three widely used algorithms: Shortest Paths, PageRank, and Triangle Counting.
Submitted 15 July, 2025;
originally announced July 2025.
-
Gemini 2.5: Pushing the Frontier with Advanced Reasoning, Multimodality, Long Context, and Next Generation Agentic Capabilities
Authors:
Gheorghe Comanici,
Eric Bieber,
Mike Schaekermann,
Ice Pasupat,
Noveen Sachdeva,
Inderjit Dhillon,
Marcel Blistein,
Ori Ram,
Dan Zhang,
Evan Rosen,
Luke Marris,
Sam Petulla,
Colin Gaffney,
Asaf Aharoni,
Nathan Lintz,
Tiago Cardal Pais,
Henrik Jacobsson,
Idan Szpektor,
Nan-Jiang Jiang,
Krishna Haridasan,
Ahmed Omran,
Nikunj Saunshi,
Dara Bahri,
Gaurav Mishra,
Eric Chu
, et al. (3284 additional authors not shown)
Abstract:
In this report, we introduce the Gemini 2.X model family: Gemini 2.5 Pro and Gemini 2.5 Flash, as well as our earlier Gemini 2.0 Flash and Flash-Lite models. Gemini 2.5 Pro is our most capable model yet, achieving SoTA performance on frontier coding and reasoning benchmarks. In addition to its incredible coding and reasoning skills, Gemini 2.5 Pro is a thinking model that excels at multimodal understanding and it is now able to process up to 3 hours of video content. Its unique combination of long context, multimodal and reasoning capabilities can be combined to unlock new agentic workflows. Gemini 2.5 Flash provides excellent reasoning abilities at a fraction of the compute and latency requirements and Gemini 2.0 Flash and Flash-Lite provide high performance at low latency and cost. Taken together, the Gemini 2.X model generation spans the full Pareto frontier of model capability vs cost, allowing users to explore the boundaries of what is possible with complex agentic problem solving.
Submitted 22 July, 2025; v1 submitted 7 July, 2025;
originally announced July 2025.
-
Two-parameter superposable S-curves
Authors:
Vijay Prakash S
Abstract:
The straight-line equation $y=mx$ with slope $m$, when singularly perturbed as $ay^3+y=mx$ with a positive parameter $a$, results in S-shaped curves, or S-curves, on a real plane. As $a\rightarrow 0$, we get back $y=mx$, which is the cumulative distribution function of a continuous uniform distribution that describes the occurrence of every event in an interval to be equally probable. As $a\rightarrow\infty$, the derivative of $y$ has finite support only at $y=0$, resembling a degenerate distribution. Based on these arguments, in this work, we propose that these S-curves can represent anything from a maximum-entropy uniform distribution to a zero-entropy single value. We also argue that these S-curves are superposable, as they are only parametrically nonlinear but fundamentally linear. So far, the superposed forms have been used to capture the patterns of natural systems such as the nonlinear dynamics of biological growth and the kinetics of enzyme reactions. Here, we attempt to use the S-curve and its superposed form as statistical models. We fit the models on a classical dataset containing flower measurements of iris plants and analyze their usefulness in pattern recognition. Based on these models, we claim that any non-uniform pattern can be represented as a singular perturbation to a uniform distribution. However, our parametric estimation procedure has some limitations, such as sensitivity to initial conditions, depending on the data at hand.
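To make the perturbed equation concrete, the sketch below evaluates the S-curve numerically by solving $ay^3+y=mx$ for $y$. Because the left-hand side is strictly increasing in $y$ when $a \ge 0$, there is exactly one real root and it lies between $0$ and $mx$, so plain bisection suffices. The function name `s_curve` and the choice of bisection are assumptions for illustration; the abstract does not say how the curve is computed.

```python
def s_curve(x, m, a, iterations=200):
    """Return the real y that solves a*y**3 + y = m*x, for a >= 0, by bisection."""
    target = m * x
    # The root has the same sign as m*x and magnitude at most |m*x|.
    lo, hi = min(0.0, target), max(0.0, target)
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if a * mid**3 + mid < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# a = 0 recovers the straight line; larger a flattens the curve.
for a in (0.0, 1.0, 100.0):
    print(a, s_curve(2.0, 1.0, a))
```

With $a=0$ the call reduces to $y=mx$, and increasing $a$ pulls $y$ toward zero for the same $x$, which is the bounding behaviour the abstract describes.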
Submitted 6 May, 2025; v1 submitted 28 April, 2025;
originally announced April 2025.
-
ForcePose: A Deep Learning Approach for Force Calculation Based on Action Recognition Using MediaPipe Pose Estimation Combined with Object Detection
Authors:
Nandakishor M,
Vrinda Govind V,
Anuradha Puthalath,
Anzy L,
Swathi P S,
Aswathi R,
Devaprabha A R,
Varsha Raj,
Midhuna Krishnan K,
Akhila Anilkumar T V,
Yamuna P V
Abstract:
Force estimation in human-object interactions is crucial for various fields like ergonomics, physical therapy, and sports science. Traditional methods depend on specialized equipment such as force plates and sensors, which makes accurate assessments both expensive and restricted to laboratory settings. In this paper, we introduce ForcePose, a novel deep learning framework that estimates applied forces by combining human pose estimation with object detection. Our approach leverages MediaPipe for skeletal tracking and SSD MobileNet for object recognition to create a unified representation of human-object interaction. We've developed a specialized neural network that processes both spatial and temporal features to predict force magnitude and direction without needing any physical sensors. After training on our dataset of 850 annotated videos with corresponding force measurements, our model achieves a mean absolute error of 5.83 N in force magnitude and 7.4 degrees in force direction. When compared to existing computer vision approaches, our method performs 27.5% better while still offering real-time performance on standard computing hardware. ForcePose opens up new possibilities for force analysis in diverse real-world scenarios where traditional measurement tools are impractical or intrusive. This paper discusses our methodology, the dataset creation process, evaluation metrics, and potential applications across rehabilitation, ergonomics assessment, and athletic performance analysis.
Submitted 28 March, 2025;
originally announced March 2025.
-
Single Shot AI-assisted quantification of KI-67 proliferation index in breast cancer
Authors:
Deepti Madurai Muthu,
Priyanka S,
Lalitha Rani N,
P. G. Kubendran Amos
Abstract:
Reliable quantification of Ki-67, a key proliferation marker in breast cancer, is essential for molecular subtyping and informed treatment planning. Conventional approaches, including visual estimation and manual counting, suffer from interobserver variability and limited reproducibility. This study introduces an AI-assisted method using the YOLOv8 object detection framework for automated Ki-67 scoring. High-resolution digital images (40x magnification) of immunohistochemically stained tumor sections were captured from Ki-67 hotspot regions and manually annotated by a domain expert to distinguish Ki-67-positive and negative tumor cells. The dataset was augmented and divided into training (80%), validation (10%), and testing (10%) subsets. Among the YOLOv8 variants tested, the Medium model achieved the highest performance, with a mean Average Precision at 50% Intersection over Union (mAP50) exceeding 85% for Ki-67-positive cells. The proposed approach offers an efficient, scalable, and objective alternative to conventional scoring methods, supporting greater consistency in Ki-67 evaluation. Future directions include developing user-friendly clinical interfaces and expanding to multi-institutional datasets to enhance generalizability and facilitate broader adoption in diagnostic practice.
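Once positive and negative tumor cells have been detected, the Ki-67 proliferation index is conventionally the percentage of positive cells among all counted tumor cells. A minimal sketch, assuming the detector's output has already been reduced to a list of class labels (the label strings and the function name `ki67_index` are hypothetical and not taken from the paper):

```python
from collections import Counter


def ki67_index(labels):
    """Percentage of Ki-67-positive cells among all detected tumor cells."""
    counts = Counter(labels)
    positive = counts["positive"]
    negative = counts["negative"]
    total = positive + negative
    if total == 0:
        raise ValueError("no tumor cells detected")
    return 100.0 * positive / total


# Hypothetical detections for one hotspot image.
detections = ["positive"] * 3 + ["negative"] * 7
print(ki67_index(detections))
```

Labels other than the two assumed classes are ignored by this sketch, so stray detections do not change the denominator.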
Submitted 25 March, 2025;
originally announced March 2025.
-
Transforming Student Evaluation with Adaptive Intelligence and Performance Analytics
Authors:
Pushpalatha K S,
Abhishek Mangalur,
Ketan Hegde,
Chetan Badachi,
Mohammad Aamir
Abstract:
Developments in Artificial Intelligence (AI) offer transformative potential for redefining student assessment methodologies. This paper examines how advances in AI can reshape approaches to assessing students. It presents a system for evaluating student performance using AI, in particular the Gemini API, for generating questions, grading, and reporting on student performance. The system is intended to make it easy to create, schedule, and deliver assessments while reducing the chance of cheating through options such as full-screen mode and time limits. It supports several question formats, including multiple-choice, short-answer, and descriptive questions, all generated by Gemini. Its most prominent feature is a self-checking system that gives each student instant feedback in the form of a score, with explanations of wrong answers. Moreover, the platform offers intelligent learning progressions, through which users can monitor their performance and receive recommendations for a target level of performance. It gives students and educators real-time analytics and feedback on what they are good at and where they need to improve. The system not only makes assessment easier but also improves grading accuracy and strengthens a data-based learning process for students.
Submitted 7 February, 2025;
originally announced March 2025.
-
A Novel Quaternary Decoder Design Utilizing 32nm CMOS and GNRFET Technology for Enhanced High-Density Memory Applications
Authors:
Anindita Chattopadhyay,
Pooja Desai,
Vishwas P,
Vasundhara Patel K. S
Abstract:
Multi-Valued Logic (MVL) defines more than two logic levels to represent data, whereas binary logic has two. It has been shown that MVL circuits use circuit resources more effectively at different voltage levels, with less circuitry and greater efficiency. Recently, the graphene nanoribbon field-effect transistor (GNRFET) has drawn a lot of interest due to its higher electron mobility. This paper presents a quaternary decoder implemented in GNRFET, analyzes its latency, power, and performance, and compares the power and delay characteristics of the design implemented in both CMOS and GNRFET at the 32nm technology node.
Submitted 16 February, 2025;
originally announced February 2025.
-
A Novel Approach using CapsNet and Deep Belief Network for Detection and Identification of Oral Leukopenia
Authors:
Hirthik Mathesh GV,
Kavin Chakravarthy M,
Sentil Pandi S
Abstract:
Oral cancer constitutes a significant global health concern, resulting in 277,484 fatalities in 2023, with the highest prevalence observed in low- and middle-income nations. Automating the detection of potentially malignant and malignant lesions in the oral cavity could enable cost-effective and early disease diagnosis. Establishing an extensive repository of meticulously annotated oral lesions is essential. In this research, photos are collected from global clinical experts, who have been equipped with an annotation tool to generate comprehensive labelling. This research presents a novel approach for integrating bounding box annotations from various doctors. Additionally, a Deep Belief Network combined with CapsNet is employed to develop automated systems that extract intricate patterns to address this challenging problem. This study evaluated two deep learning-based computer vision methodologies for the automated detection and classification of oral lesions to facilitate the early detection of oral cancer: image classification utilizing CapsNet, and object detection. Image classification attained an F1 score of 94.23% for detecting photos with lesions and 93.46% for identifying images necessitating referral. Object detection attained an F1 score of 89.34% for identifying lesions for referral. Further results are reported for classification by type of referral decision. Our preliminary findings indicate that deep learning is capable of addressing this complex problem.
Submitted 1 January, 2025;
originally announced January 2025.
-
Real-valued continued fraction of straight lines
Authors:
Vijay Prakash S
Abstract:
In an unbounded plane, straight lines are used extensively for mathematical analysis. They are tools of convenience. However, those with high slope values become unbounded at a faster rate than the independent variable. So, straight lines, in this work, are made to be bounded by introducing a parametric nonlinear term that is positive. The straight lines are transformed into bounded nonlinear curves that become unbounded at a much slower rate than the independent variable. This transforming equation can be expressed as a continued fraction of straight lines. The continued fraction is real-valued and converges to the solutions of the transforming equation. Following Euler's method, the continued fraction has been reduced to an infinite series. The usefulness of the bounding nature of the continued fraction is demonstrated by solving the problem of image classification. Parameters estimated on the Fashion-MNIST dataset of greyscale images using a continued fraction of regression lines have less variance, converge quickly, and are more accurate than their linear counterparts. Moreover, this multi-dimensional parametric estimation problem can be expressed on the $xy$-plane using the parameters of the continued fraction, and patterns emerge on planar plots.
Submitted 16 December, 2024;
originally announced December 2024.
-
SplatR : Experience Goal Visual Rearrangement with 3D Gaussian Splatting and Dense Feature Matching
Authors:
Arjun P S,
Andrew Melnik,
Gora Chand Nandi
Abstract:
The Experience Goal Visual Rearrangement task stands as a foundational challenge within Embodied AI, requiring an agent to construct a robust world model that accurately captures the goal state. The agent uses this world model to restore a shuffled scene to its original configuration, making an accurate representation of the world essential for successfully completing the task. In this work, we present a novel framework that leverages 3D Gaussian Splatting as a 3D scene representation for the experience goal visual rearrangement task. Recent advances in volumetric scene representation, such as 3D Gaussian Splatting, offer fast rendering of high-quality, photo-realistic novel views. Our approach gives the agent consistent views of the current and goal settings of the rearrangement task, which lets it directly compare the goal state and the shuffled state of the world in image space. To compare these views, we propose a dense feature matching method with visual features extracted from a foundation model, leveraging its more universal feature representation, which facilitates robustness and generalization. We validate our approach on the AI2-THOR rearrangement challenge benchmark and demonstrate improvements over current state-of-the-art methods.
Submitted 17 December, 2024; v1 submitted 21 November, 2024;
originally announced November 2024.
-
Targeted Neural Architectures in Multi-Objective Frameworks for Complete Glioma Characterization from Multimodal MRI
Authors:
Shravan Venkatraman,
Pandiyaraju V,
Abeshek A,
Aravintakshan S A,
Pavan Kumar S,
Kannan A,
Madhan S
Abstract:
Brain tumors result from abnormal cell growth in brain tissue. If undiagnosed, they cause neurological deficits, including cognitive impairment, motor dysfunction, and sensory loss. As tumors grow, intracranial pressure increases, potentially leading to fatal complications such as brain herniation. Early diagnosis and treatment are crucial to controlling these effects and slowing tumor progression. Deep learning (DL) and artificial intelligence (AI) are increasingly used to assist doctors in early diagnosis through magnetic resonance imaging (MRI) scans. Our research proposes targeted neural architectures within multi-objective frameworks that can localize, segment, and classify the grade of these gliomas from multimodal MRI images to solve this critical issue. Our localization framework utilizes a targeted architecture that enhances the LinkNet framework with an encoder inspired by VGG19 for better multimodal feature extraction from the tumor along with spatial and graph attention mechanisms that sharpen feature focus and inter-feature relationships. For the segmentation objective, we deployed a specialized framework using the SeResNet101 CNN model as the encoder backbone integrated into the LinkNet architecture, achieving an IoU Score of 96%. The classification objective is addressed through a distinct framework implemented by combining the SeResNet152 feature extractor with Adaptive Boosting classifier, reaching an accuracy of 98.53%. Our multi-objective approach with targeted neural architectures demonstrated promising results for complete glioma characterization, with the potential to advance medical AI by enabling early diagnosis and providing more accurate treatment options for patients.
Submitted 18 March, 2025; v1 submitted 25 September, 2024;
originally announced September 2024.
-
Acoustic Levitation for Environmental Remediation: An Effective Approach for Containment and Forecasting of Oil Spills
Authors:
L Rochit,
Nithish Kumar N,
Devi Priya V S,
Sibi Chakkaravarthy Sethuraman,
Anitha Subramanian
Abstract:
The ocean ecology is badly impacted by large-scale oil spills, plastic waste, and chemical pollution, which destroy ecosystems and endanger marine life. Acknowledging the detrimental effects of oil spills on ecosystems, our research aims to establish the foundation for creative methods to lessen their impact. With an emphasis on the containment and prediction of oil spills, this research investigates the potential of acoustic levitation as a cutting-edge technique for environmental cleanup. Effectively separating and eliminating pollutants without causing additional ecological harm is a major issue for traditional oil spill cleanup techniques. Acoustic levitation provides a non-invasive, accurate, and effective alternative by using sound waves to precisely and subtly separate oil droplets from water in controlled environments. This proposed approach can reduce the negative effects on the environment and increase the efficacy of cleanup efforts. The findings have been examined and assessed by proof of concept experiments with oil droplets, identifying the relationship between the intensity of ultrasonic pressure and the proportion of oil droplets collected.
Submitted 3 September, 2024;
originally announced September 2024.
-
Leveraging SeNet and ResNet Synergy within an Encoder-Decoder Architecture for Glioma Detection
Authors:
Pandiyaraju V,
Shravan Venkatraman,
Abeshek A,
Pavan Kumar S,
Aravintakshan S A
Abstract:
Brain tumors are abnormalities that can severely impact a patient's health, leading to life-threatening conditions such as cancer. These can result in various debilitating effects, including neurological issues, cognitive impairment, motor and sensory deficits, as well as emotional and behavioral changes. These symptoms significantly affect a patient's quality of life, making early diagnosis and timely treatment essential to prevent further deterioration. However, accurately segmenting the tumor region from medical images, particularly MRI scans, is a challenging and time-consuming task that requires the expertise of radiologists. Manual segmentation can also be prone to human errors. To address these challenges, this research leverages the synergy of SeNet and ResNet architectures within an encoder-decoder framework, designed specifically for glioma detection and segmentation. The proposed model incorporates the power of SeResNet-152 as the backbone, integrated into a robust encoder-decoder structure to enhance feature extraction and improve segmentation accuracy. This novel approach significantly reduces the dependency on manual tasks and improves the precision of tumor identification. Evaluation of the model demonstrates strong performance, achieving 87% in Dice Coefficient, 89.12% in accuracy, 88% in IoU score, and 82% in mean IoU score, showcasing its effectiveness in tackling the complex problem of brain tumor segmentation.
Submitted 1 September, 2024;
originally announced September 2024.
-
Colaboot: A Cloud-based Diskless PC Booting Mechanism
Authors:
Aditya Mitra,
Anisha Ghosh,
Sibi Chakkaravarthy Sethuraman,
Devi Priya V S
Abstract:
Recent increases in endpoint-based security events and threats have compelled enterprise operations to switch to virtual desktop infrastructure and web-based applications. In addition to reducing potential hazards, this has guaranteed a consistent desktop environment for every user. On the other hand, the attack surface is greatly increased because all endpoints are connected to the company network, which could harbor malware and other advanced persistent threats. This results in a considerable loss of system resources on each individual endpoint. Hence, our work proposes a standard called Colaboot that enables machines throughout a company to boot from a single operating system, in order to address these problems and guarantee a consistent operating system environment that can be easily updated to the most recent security patches across all workstations.
Submitted 30 August, 2024;
originally announced August 2024.
-
Deep Learning at the Intersection: Certified Robustness as a Tool for 3D Vision
Authors:
Gabriel Pérez S,
Juan C. Pérez,
Motasem Alfarra,
Jesús Zarzar,
Sara Rojas,
Bernard Ghanem,
Pablo Arbeláez
Abstract:
This paper presents preliminary work on a novel connection between certified robustness in machine learning and the modeling of 3D objects. We highlight an intriguing link between the Maximal Certified Radius (MCR) of a classifier representing a space's occupancy and the space's Signed Distance Function (SDF). Leveraging this relationship, we propose to use the certification method of randomized smoothing (RS) to compute SDFs. Since RS' high computational cost prevents its practical usage as a way to compute SDFs, we propose an algorithm to efficiently run RS in low-dimensional applications, such as 3D space, by expressing RS' fundamental operations as Gaussian smoothing on pre-computed voxel grids. Our approach offers an innovative and practical tool to compute SDFs, validated through proof-of-concept experiments in novel view synthesis. This paper bridges two previously disparate areas of machine learning, opening new avenues for further exploration and potential cross-domain advancements.
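As a rough illustration of the MCR-SDF link described in this abstract, consider the simplest case: an occupancy classifier for a one-dimensional half-space. The sketch below assumes the standard binary-case randomized-smoothing radius, sigma * Phi^{-1}(p), where p is the Gaussian-smoothed probability of the "occupied" class. The function names, the half-space setup, and the sign convention (positive inside the occupied region) are illustrative assumptions, not the authors' algorithm, which operates on pre-computed voxel grids.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1


def smoothed_occupancy_halfspace(x: float, sigma: float) -> float:
    """Probability that x + noise lands in the occupied region {t <= 0},
    with noise ~ N(0, sigma^2). Closed form for a 1D half-space."""
    return _STD_NORMAL.cdf(-x / sigma)


def signed_certified_radius(p_occupied: float, sigma: float) -> float:
    """Binary-case randomized-smoothing radius, sigma * Phi^{-1}(p).
    Positive when 'occupied' is the majority class, negative otherwise.
    Requires 0 < p_occupied < 1."""
    return sigma * _STD_NORMAL.inv_cdf(p_occupied)


if __name__ == "__main__":
    sigma = 0.5  # assumed smoothing noise level
    for x in (-0.7, -0.1, 0.3):
        p = smoothed_occupancy_halfspace(x, sigma)
        r = signed_certified_radius(p, sigma)
        # For a half-space, r recovers -x: the signed distance to the boundary.
        print(f"x={x:+.2f}  p={p:.4f}  radius={r:+.4f}")
```

In this toy setting the radius equals the signed distance to the boundary at x = 0 for any sigma, which is the kind of correspondence the abstract refers to. For general shapes the relationship is not this exact.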
Submitted 23 August, 2024;
originally announced August 2024.
-
Panorama Tomosynthesis from Head CBCT with Simulated Projection Geometry
Authors:
Anusree P. S.,
Bikram Keshari Parida,
Seong Yong Moon,
Wonsang You
Abstract:
Cone Beam Computed Tomography (CBCT) and Panoramic X-rays are the most commonly used imaging modalities in dental health care. CBCT can produce three-dimensional views of a patient's head, providing clinicians with better diagnostic capability, whereas Panoramic X-ray can capture the entire maxillofacial region in a single image. If the CBCT is already available, it can be beneficial to synthesize a Panoramic X-ray, thereby avoiding an immediate additional scan and extra radiation exposure. Existing methods focus on delineating an approximate dental arch and creating orthogonal projections along this arch. However, no golden standard is available for such dental arch extractions, and this choice can affect the quality of synthesized X-rays. To avoid such issues, we propose a novel method for synthesizing Panoramic X-rays from diverse head CBCTs, employing a simulated projection geometry and dynamic rotation centers. Our method effectively synthesized panoramic views from CBCT, even for patients with missing or nonexistent teeth and in the presence of severe metal implants. Our results demonstrate that this method can generate high-quality panoramic images irrespective of the CBCT scanner geometry.
Submitted 20 August, 2024; v1 submitted 18 August, 2024;
originally announced August 2024.
-
A Channel Attention-Driven Hybrid CNN Framework for Paddy Leaf Disease Detection
Authors:
Pandiyaraju V,
Shravan Venkatraman,
Abeshek A,
Pavan Kumar S,
Aravintakshan S A,
Senthil Kumar A M,
Kannan A
Abstract:
Farmers face various challenges when it comes to identifying diseases in rice leaves during their early stages of growth, which is a major reason for poor produce. Therefore, early and accurate disease identification is important in agriculture to avoid crop loss and improve cultivation. In this research, we propose a novel hybrid deep learning (DL) classifier designed by extending the Squeeze-and-Excitation network architecture with a channel attention mechanism and the Swish ReLU activation function. The channel attention mechanism in our proposed model identifies the most important feature channels required for classification during feature extraction and selection. The dying ReLU problem is mitigated by utilizing the Swish ReLU activation function, and the Squeeze-and-Excitation blocks improve information propagation and cross-channel interaction. Upon evaluation, our model achieved a high F1-score of 99.76% and an accuracy of 99.74%, surpassing the performance of existing models. These outcomes demonstrate the potential of state-of-the-art DL techniques in agriculture, contributing to the advancement of more efficient and reliable disease detection systems.
Submitted 16 July, 2024;
originally announced July 2024.
-
Leveraging Bi-Focal Perspectives and Granular Feature Integration for Accurate Reliable Early Alzheimer's Detection
Authors:
Shravan Venkatraman,
Pandiyaraju V,
Abeshek A,
Pavan Kumar S,
Aravintakshan S A
Abstract:
Being the most commonly known neurodegeneration, Alzheimer's Disease (AD) is annually diagnosed in millions of patients. The present medical scenario still finds the exact diagnosis and classification of AD through neuroimaging data as a challenging task. Traditional CNNs can extract a good amount of low-level information in an image while failing to extract high-level minuscule particles, which is a significant challenge in detecting AD from MRI scans. To overcome this, we propose a novel Granular Feature Integration method to combine information extraction at different scales along with an efficient information flow, enabling the model to capture both broad and fine-grained features simultaneously. We also propose a Bi-Focal Perspective mechanism to highlight the subtle neurofibrillary tangles and amyloid plaques in the MRI scans, ensuring that critical pathological markers are accurately identified. Our model achieved an F1-Score of 99.31%, precision of 99.24%, and recall of 99.51%. These scores prove that our model is significantly better than the state-of-the-art (SOTA) CNNs in existence.
Submitted 18 March, 2025; v1 submitted 15 July, 2024;
originally announced July 2024.
-
Exploiting Precision Mapping and Component-Specific Feature Enhancement for Breast Cancer Segmentation and Identification
Authors:
Pandiyaraju V,
Shravan Venkatraman,
Pavan Kumar S,
Santhosh Malarvannan,
Kannan A
Abstract:
Breast cancer is one of the leading causes of death globally, and thus there is an urgent need for early and accurate diagnostic techniques. Although ultrasound imaging is a widely used technique for breast cancer screening, it faces challenges such as poor boundary delineation caused by variations in tumor morphology and reduced diagnostic accuracy due to inconsistent image quality. To address these challenges, we propose novel Deep Learning (DL) frameworks for breast lesion segmentation and classification. We introduce a precision mapping mechanism (PMM) for a precision mapping and attention-driven LinkNet (PMAD-LinkNet) segmentation framework that dynamically adapts spatial mappings through morphological variation analysis, enabling precise pixel-level refinement of tumor boundaries. Subsequently, we introduce a component-specific feature enhancement module (CSFEM) for a component-specific feature-enhanced classifier (CSFEC-Net). Through a multi-level attention approach, the CSFEM magnifies distinguishing features of benign, malignant, and normal tissues. The proposed frameworks are evaluated against existing literature and a diverse set of state-of-the-art Convolutional Neural Network (CNN) architectures. The obtained results show that our segmentation model achieves an accuracy of 98.1%, an IoU of 96.9%, and a Dice Coefficient of 97.2%. For the classification model, an accuracy of 99.2% is achieved with F1-score, precision, and recall values of 99.1%, 99.3%, and 99.1%, respectively.
Submitted 10 February, 2025; v1 submitted 3 July, 2024;
originally announced July 2024.
-
EnterpriseEM: Fine-tuned Embeddings for Enterprise Semantic Search
Authors:
Kamalkumar Rathinasamy,
Jayarama Nettar,
Amit Kumar,
Vishal Manchanda,
Arun Vijayakumar,
Ayush Kataria,
Venkateshprasanna Manjunath,
Chidambaram GS,
Jaskirat Singh Sodhi,
Shoeb Shaikh,
Wasim Akhtar Khan,
Prashant Singh,
Tanishq Dattatray Ige,
Vipin Tiwari,
Rajab Ali Mondal,
Harshini K,
S Reka,
Chetana Amancharla,
Faiz ur Rahman,
Harikrishnan P A,
Indraneel Saha,
Bhavya Tiwary,
Navin Shankar Patel,
Pradeep T S,
Balaji A J
, et al. (2 additional authors not shown)
Abstract:
Enterprises grapple with the significant challenge of managing proprietary unstructured data, hindering efficient information retrieval. This has led to the emergence of AI-driven information retrieval solutions, designed to adeptly extract relevant insights to address employee inquiries. These solutions often leverage pre-trained embedding models and generative models as foundational components. While pre-trained embeddings may exhibit proximity or disparity based on their original training objectives, they might not fully align with the unique characteristics of enterprise-specific data, leading to suboptimal alignment with the retrieval goals of enterprise environments. In this paper, we propose a comprehensive methodology for contextualizing pre-trained embedding models to enterprise environments, covering the entire process from data preparation to model fine-tuning and evaluation. By adapting the embeddings to better suit the retrieval tasks prevalent in enterprises, we aim to enhance the performance of information retrieval solutions. We discuss the process of fine-tuning, its effect on retrieval accuracy, and the potential benefits for enterprise information management. Our findings demonstrate the efficacy of fine-tuned embedding models in improving the precision and relevance of search results in enterprise settings.
Submitted 27 September, 2024; v1 submitted 18 May, 2024;
originally announced June 2024.
-
Cognitive Planning for Object Goal Navigation using Generative AI Models
Authors:
Arjun P S,
Andrew Melnik,
Gora Chand Nandi
Abstract:
Recent advancements in Generative AI, particularly in Large Language Models (LLMs) and Large Vision-Language Models (LVLMs), offer new possibilities for integrating cognitive planning into robotic systems. In this work, we present a novel framework for solving the object goal navigation problem that generates efficient exploration strategies. Our approach enables a robot to navigate unfamiliar environments by leveraging LLMs and LVLMs to understand the semantic structure of the scene. To address the challenge of representing complex environments without overwhelming the system, we propose a 3D modular scene representation, enriched with semantic descriptions. This representation is dynamically pruned using an LLM-based mechanism, which filters irrelevant information and focuses on task-specific data. By combining these elements, our system generates high-level sub-goals that guide the exploration of the robot toward the target object. We validate our approach in simulated environments, demonstrating its ability to enhance object search efficiency while maintaining scalability in complex settings.
Submitted 5 November, 2024; v1 submitted 30 March, 2024;
originally announced April 2024.
-
SEGAA: A Unified Approach to Predicting Age, Gender, and Emotion in Speech
Authors:
Aron R,
Indra Sigicharla,
Chirag Periwal,
Mohanaprasad K,
Nithya Darisini P S,
Sourabh Tiwari,
Shivani Arora
Abstract:
The interpretation of human voices holds importance across various applications. This study ventures into predicting age, gender, and emotion from vocal cues, a field with vast applications. Voice analysis tech advancements span domains, from improving customer interactions to enhancing healthcare and retail experiences. Discerning emotions aids mental health, while age and gender detection are vital in various contexts. Exploring deep learning models for these predictions involves comparing single, multi-output, and sequential models highlighted in this paper. Sourcing suitable data posed challenges, resulting in the amalgamation of the CREMA-D and EMO-DB datasets. Prior work showed promise in individual predictions, but limited research considered all three variables simultaneously. This paper identifies flaws in an individual model approach and advocates for our novel multi-output learning architecture Speech-based Emotion Gender and Age Analysis (SEGAA) model. The experiments suggest that Multi-output models perform comparably to individual models, efficiently capturing the intricate relationships between variables and speech inputs, all while achieving improved runtime.
Submitted 1 March, 2024;
originally announced March 2024.
-
Crowdsourcing Dermatology Images with Google Search Ads: Creating a Real-World Skin Condition Dataset
Authors:
Abbi Ward,
Jimmy Li,
Julie Wang,
Sriram Lakshminarasimhan,
Ashley Carrick,
Bilson Campana,
Jay Hartford,
Pradeep Kumar S,
Tiya Tiyasirichokchai,
Sunny Virmani,
Renee Wong,
Yossi Matias,
Greg S. Corrado,
Dale R. Webster,
Dawn Siegel,
Steven Lin,
Justin Ko,
Alan Karthikesalingam,
Christopher Semturs,
Pooja Rao
Abstract:
Background: Health datasets from clinical sources do not reflect the breadth and diversity of disease in the real world, impacting research, medical education, and artificial intelligence (AI) tool development. Dermatology is a suitable area to develop and test a new and scalable method to create representative health datasets.
Methods: We used Google Search advertisements to invite contributions to an open access dataset of images of dermatology conditions, demographic and symptom information. With informed contributor consent, we describe and release this dataset containing 10,408 images from 5,033 contributions from internet users in the United States over 8 months starting March 2023. The dataset includes dermatologist condition labels as well as estimated Fitzpatrick Skin Type (eFST) and Monk Skin Tone (eMST) labels for the images.
Results: We received a median of 22 submissions/day (IQR 14-30). Female (66.72%) and younger (52% < age 40) contributors had a higher representation in the dataset compared to the US population, and 32.6% of contributors reported a non-White racial or ethnic identity. Over 97.5% of contributions were genuine images of skin conditions. Dermatologist confidence in assigning a differential diagnosis increased with the number of available variables, and showed a weaker correlation with image sharpness (Spearman's P values <0.001 and 0.01 respectively). Most contributions were short-duration (54% with onset < 7 days ago) and 89% were allergic, infectious, or inflammatory conditions. eFST and eMST distributions reflected the geographical origin of the dataset. The dataset is available at github.com/google-research-datasets/scin.
Conclusion: Search ads are effective at crowdsourcing images of health conditions. The SCIN dataset bridges important gaps in the availability of representative images of common skin conditions.
Submitted 28 February, 2024;
originally announced February 2024.
-
List Coloring of some Cayley graphs using Kernel perfections
Authors:
Prajnanaswaroopa S
Abstract:
In this paper, we try to determine exact values of, or bounds on, the choosability (list chromatic number) of some Cayley graphs, typically some unitary Cayley graphs and Cayley graphs on dihedral groups.
Submitted 25 February, 2024;
originally announced February 2024.
-
Modified RRT* for Path Planning in Autonomous Driving
Authors:
Sugirtha T,
Pranav S,
Nitin Benjamin Dasiah,
Sridevi M
Abstract:
Essential tasks in autonomous driving include environment perception, detection and tracking, path planning, and action control. This paper focuses on path planning, one of the more challenging tasks because it must find an optimal path in highly complex and dynamic environments. A driving scenario usually has a large number of obstacles along the route. In this paper, we propose a two-stage path planning algorithm named Angle-based Directed Rapidly exploring Random Trees (AD-RRT*) to address the problem of finding an optimal path in complex environments. The proposed algorithm uses the A* algorithm for global path planning and modifies RRT* to bound the samples using an angle. The efficiency of the proposed algorithm is evaluated through experiments in different scenarios based on the location and number of obstacles. The proposed algorithm showed a higher rate of convergence, with reduced time and fewer nodes than the base RRT* algorithm.
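One way to picture "bounding the samples using angle" is to draw RRT* samples only inside a cone around the direction from the current node to the goal. The sketch below is a minimal, hypothetical illustration of that idea in 2D. The function names, the cone half-angle, and the sampling radius are assumptions made for the example, and the paper's actual sampling rule may differ.

```python
import math
import random


def sample_in_cone(node, goal, half_angle_rad, max_radius, rng=random):
    """Draw a 2D sample within `half_angle_rad` of the node->goal direction
    and within `max_radius` of `node` (hypothetical sampling rule)."""
    heading = math.atan2(goal[1] - node[1], goal[0] - node[0])
    angle = heading + rng.uniform(-half_angle_rad, half_angle_rad)
    radius = rng.uniform(0.0, max_radius)
    return (node[0] + radius * math.cos(angle),
            node[1] + radius * math.sin(angle))


def angle_to_goal(node, goal, point):
    """Absolute angle (radians) between node->goal and node->point."""
    heading = math.atan2(goal[1] - node[1], goal[0] - node[0])
    bearing = math.atan2(point[1] - node[1], point[0] - node[0])
    # Wrap the difference into [-pi, pi) before taking the magnitude.
    diff = (bearing - heading + math.pi) % (2 * math.pi) - math.pi
    return abs(diff)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the run is repeatable
    node, goal = (0.0, 0.0), (10.0, 5.0)
    for _ in range(5):
        print(sample_in_cone(node, goal, math.radians(30), 2.0, rng))
```

Restricting samples this way trades exploration for directedness: fewer nodes are spent away from the goal, but a cone that is too narrow can miss detours around obstacles, so the half-angle is a tuning parameter in this sketch.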
Submitted 19 February, 2024;
originally announced February 2024.
-
Tandem Transformers for Inference Efficient LLMs
Authors:
Aishwarya P S,
Pranav Ajit Nair,
Yashas Samaga,
Toby Boyd,
Sanjiv Kumar,
Prateek Jain,
Praneeth Netrapalli
Abstract:
The autoregressive nature of conventional large language models (LLMs) inherently limits inference speed, as tokens are generated sequentially. While speculative and parallel decoding techniques attempt to mitigate this, they face limitations: either relying on less accurate smaller models for generation or failing to fully leverage the base LLM's representations.
We introduce a novel architecture, Tandem transformers, to address these issues. This architecture uniquely combines (1) a small autoregressive model and (2) a large model operating in block mode (processing multiple tokens simultaneously). The small model's predictive accuracy is substantially enhanced by granting it attention to the large model's richer representations. On the PaLM2 pretraining dataset, a tandem of PaLM2-Bison and PaLM2-Gecko demonstrates a 3.3% improvement in next-token prediction accuracy over a standalone PaLM2-Gecko, offering a 1.16x speedup compared to a PaLM2-Otter model with comparable downstream performance. We further incorporate the tandem model within the speculative decoding (SPEED) framework where the large model validates tokens from the small model. This ensures that the Tandem of PaLM2-Bison and PaLM2-Gecko achieves substantial speedup (around 1.14x faster than using vanilla PaLM2-Gecko in SPEED) while maintaining identical downstream task accuracy.
Submitted 20 October, 2024; v1 submitted 13 February, 2024;
originally announced February 2024.
-
Learning-Augmented K-Means Clustering Using Dimensional Reduction
Authors:
Issam K. O Jabari,
Shofiyah,
Pradiptya Kahvi S,
Novi Nur Putriwijaya,
Novanto Yudistira
Abstract:
Learning augmentation is a machine learning concept built to improve the performance of a method or model, such as enhancing its ability to predict and generalize data or features, or testing the reliability of the method by introducing noise and other factors. On the other hand, clustering is a fundamental aspect of data analysis and has long been used to understand the structure of large datasets. Despite its long history, the k-means algorithm still faces challenges. One approach, as suggested by Ergun et al., is to use a predictor to minimize the sum of squared distances between each data point and a specified centroid. However, it is known that the computational cost of this algorithm increases with the value of k, and it often gets stuck in local minima. In response to these challenges, we propose a solution to reduce the dimensionality of the dataset using Principal Component Analysis (PCA). It is worth noting that when using k values of 10 and 25, the proposed algorithm yields lower cost results compared to running it without PCA. "Principal component analysis (PCA) is the problem of fitting a low-dimensional affine subspace to a set of data points in a high-dimensional space. PCA is well-established in the literature and has become one of the most useful tools for data modeling, compression, and visualization."
Submitted 6 January, 2024;
originally announced January 2024.
-
Imbalanced Data Stream Classification using Dynamic Ensemble Selection
Authors:
Priya. S,
Haribharathi Sivakumar,
Vijay Arvind. R
Abstract:
Modern streaming data categorization faces significant challenges from concept drift and class imbalanced data. This negatively impacts the output of the classifier, leading to improper classification. Furthermore, other factors such as the overlapping of multiple classes limit the extent of the correctness of the output. This work proposes a novel framework for integrating data pre-processing and dynamic ensemble selection, by formulating the classification framework for the nonstationary drifting imbalanced data stream, which employs the data pre-processing and dynamic ensemble selection techniques. The proposed framework was evaluated using six artificially generated data streams with differing imbalance ratios in combination with two different types of concept drifts. Each stream is composed of 200 chunks of 500 objects described by eight features and contains five concept drifts. Seven pre-processing techniques and two dynamic ensemble selection methods were considered. According to experimental results, data pre-processing combined with Dynamic Ensemble Selection techniques significantly delivers more accuracy when dealing with imbalanced data streams.
Submitted 28 September, 2023; v1 submitted 17 September, 2023;
originally announced September 2023.
-
Eigenvalues of some classes of signed complete graphs
Authors:
Prajnanaswaroopa S
Abstract:
In this work, we discuss some properties of the eigenvalues of some classes of signed complete graphs. We also obtain the form of characteristic polynomial for these graphs.
Submitted 9 September, 2023;
originally announced September 2023.
-
A Novel SLCA-UNet Architecture for Automatic MRI Brain Tumor Segmentation
Authors:
Tejashwini P S,
Thriveni J,
Venugopal K R
Abstract:
Brain tumors are regarded as one of the severe health complications that decrease life expectancy and are considered a prominent cause of mortality worldwide. Therefore, timely detection and prediction of brain tumors can help reduce death rates due to brain tumors. Biomedical image analysis is a widely known approach to diagnosing brain tumors. Although MRI is the current standard method for imaging tumors, its clinical usefulness is constrained by the requirement of manual segmentation, which is time-consuming. Deep learning-based approaches have emerged as a promising solution to develop automated biomedical image exploration tools, and the UNet architecture is commonly used for segmentation. However, the traditional UNet has limitations in terms of complexity, training, accuracy, and contextual information processing. To address these limitations, the modified UNet architecture, which incorporates residual dense blocks, layered attention, and channel attention modules, in addition to stacked convolution, can effectively capture both coarse and fine feature information. The proposed SLCA UNet approach achieves good performance on the freely accessible Brain Tumor Segmentation (BraTS) dataset, with an average performance of 0.845, 0.845, 0.999, and 8.1 in terms of Dice, Sensitivity, Specificity, and Hausdorff95, respectively, for the BraTS 2020 dataset.
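For readers unfamiliar with the overlap metrics quoted in this abstract, the following minimal sketch computes Dice, sensitivity, and specificity for a pair of flattened binary masks. The helper names and the toy masks are assumptions for illustration only. Hausdorff95 is omitted because it needs boundary distances rather than pixel counts, and this is not the authors' evaluation code.

```python
def confusion_counts(pred, truth):
    """Count TP, FP, FN, TN over two equal-length binary (0/1) sequences."""
    if len(pred) != len(truth):
        raise ValueError("pred and truth must have the same length")
    tp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 1)
    fp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 0)
    fn = sum(1 for p, t in zip(pred, truth) if p == 0 and t == 1)
    tn = sum(1 for p, t in zip(pred, truth) if p == 0 and t == 0)
    return tp, fp, fn, tn


def dice(pred, truth):
    """Dice coefficient: 2*TP / (2*TP + FP + FN)."""
    tp, fp, fn, _ = confusion_counts(pred, truth)
    denom = 2 * tp + fp + fn
    # Both masks empty: treated as a perfect match (a convention chosen here).
    return 2 * tp / denom if denom else 1.0


def sensitivity(pred, truth):
    """True positive rate: TP / (TP + FN); NaN if there are no positives."""
    tp, _, fn, _ = confusion_counts(pred, truth)
    return tp / (tp + fn) if (tp + fn) else float("nan")


def specificity(pred, truth):
    """True negative rate: TN / (TN + FP); NaN if there are no negatives."""
    _, fp, _, tn = confusion_counts(pred, truth)
    return tn / (tn + fp) if (tn + fp) else float("nan")


if __name__ == "__main__":
    # Toy flattened masks: 1 = tumor pixel, 0 = background.
    pred = [1, 1, 0, 0, 1, 0, 0, 0]
    truth = [1, 0, 0, 0, 1, 1, 0, 0]
    print(dice(pred, truth), sensitivity(pred, truth), specificity(pred, truth))
```

Specificity tends to sit close to 1 in tumor segmentation because background pixels vastly outnumber tumor pixels, which is why Dice and sensitivity are usually the more informative figures.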
Submitted 16 July, 2023;
originally announced July 2023.
-
Enhancing Room Security and Automating Class Attendance Using ID Cards
Authors:
Shravan Bhat,
Nithin R,
Pranav S
Abstract:
With the rapid advancements in technology, automation has emerged as the future of human endeavors. From simple tasks like attendance management to complex security systems, automation has the potential to revolutionize various aspects of our lives. This research paper explores the implementation of a method aimed at enhancing room security in hostels and automating class attendance using ID cards. In this study, we propose a system that utilizes the unique identity information stored in ID cards for various security and check-in tasks. By integrating RFID (Radio-Frequency Identification) reader technology, GSM modules, Node MCU, and Arduino, we create a comprehensive solution. The RFID reader scans the ID card, extracting the relevant information and verifying the user's identity. The data is then transmitted via the GSM module to a central database, ensuring real-time monitoring and security measures. Moreover, the system also enables the automation of class attendance. By utilizing the same ID cards, students can simply tap their cards on a reader placed in the classroom. This information is recorded automatically, eliminating the need for manual attendance taking and reducing errors and time consumption. This research project highlights the practical implementation of ID card technology to enhance room security in hostels and automate class attendance processes. By leveraging the power of automation, we aim to streamline administrative tasks, improve security measures, and optimize efficiency in educational institutions and other relevant settings.
Submitted 8 July, 2023;
originally announced July 2023.
-
A Framework for Securing Health Information Using Blockchain in Cloud Hosted Cyber Physical Systems
Authors:
Aisha Banu,
Sharon Priya S,
Poojitha K,
Kiruthiga R,
Ruby Annette,
Subash Chandran
Abstract:
Electronic Health Records (EHRs) have undergone numerous technical improvements in recent years, including the incorporation of mobile devices with cloud computing technologies to facilitate medical data exchanges between patients and healthcare professionals. This cutting-edge architecture enables cyber physical systems housed in the cloud to provide healthcare services with minimal operational costs, high flexibility, security, and EHR accessibility. If patient health information is stored in the hospital database, there will always be a risk of intrusion, i.e., unauthorized file access and information modification by attackers. To address this concern, we propose a decentralized EHR system based on blockchain technology. To facilitate secure EHR exchange across various patients and medical providers, we develop a reliable access control method based on smart contracts. We incorporate cryptocurrency, specifically Ethereum, in the suggested system to protect sensitive health information from potential attackers. In our suggested approach, both physicians and patients are required to be authenticated. Patients can register, and a block with a unique hash value will be generated. Once the patient discusses the disease with the physician, the physician can check the patient's condition and offer drugs. For our experiments, we employ the public blockchain Ganache and Remix-based Solidity smart contracts to protect privacy. Ether is used as the cryptocurrency.
△ Less
Submitted 25 June, 2023;
originally announced June 2023.
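The abstract above mentions that registering a patient generates a block with a unique hash value. The following is a minimal sketch of that idea in plain Python, not the authors' Ethereum/Solidity implementation: the block fields, the `make_block` and `is_chain_valid` helpers, and the patient identifier are all hypothetical and chosen only for illustration.

```python
import hashlib
import json
import time


def make_block(index, patient_record, previous_hash, timestamp=None):
    """Build a block whose SHA-256 hash covers its contents and the previous hash."""
    block = {
        "index": index,
        "timestamp": time.time() if timestamp is None else timestamp,
        "record": patient_record,
        "previous_hash": previous_hash,
    }
    # Serialize deterministically so the same contents always give the same hash.
    payload = json.dumps(block, sort_keys=True).encode("utf-8")
    block["hash"] = hashlib.sha256(payload).hexdigest()
    return block


def is_chain_valid(chain):
    """Recompute every hash and check that each block points at its predecessor."""
    for i, block in enumerate(chain):
        body = {k: v for k, v in block.items() if k != "hash"}
        payload = json.dumps(body, sort_keys=True).encode("utf-8")
        if hashlib.sha256(payload).hexdigest() != block["hash"]:
            return False
        if i > 0 and block["previous_hash"] != chain[i - 1]["hash"]:
            return False
    return True


# Hypothetical registration flow: a genesis block followed by one patient block.
genesis = make_block(0, {"note": "genesis"}, "0" * 64, timestamp=0)
patient_block = make_block(1, {"patient_id": "P-001"}, genesis["hash"], timestamp=1)
chain = [genesis, patient_block]
```

Because each hash covers the record and the previous block's hash, modifying a stored record makes `is_chain_valid` return `False`, which is the tamper-evidence property the abstract relies on.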
-
Colorings of some Cayley graphs
Authors:
Prajnanaswaroopa S
Abstract:
Cayley graphs are graphs on algebraic structures, typically groups or group-like structures. In this paper, we have obtained a few results on Cayley graphs on Cyclic groups, powers of cycles, Cayley graphs on some non-abelian groups, and vertex, edge and total colorings of Cayley graphs on gyrogroups.
Submitted 22 August, 2023; v1 submitted 19 May, 2023;
originally announced May 2023.
-
Taxonomy of AISecOps Threat Modeling for Cloud Based Medical Chatbots
Authors:
Ruby Annette J,
Aisha Banu,
Sharon Priya S,
Subash Chandran
Abstract:
Artificial Intelligence (AI) is playing a vital role in all aspects of technology, including cyber security. Applications of conversational AI such as chatbots are also becoming very popular in the medical field to provide timely and immediate medical assistance to patients in need. As medical chatbots deal with a lot of sensitive information, the security of these chatbots is crucial. To secure the confidentiality, integrity, and availability of cloud-hosted assets like these, medical chatbots can be monitored using AISecOps (Artificial Intelligence for Secure IT Operations). AISecOps is an emerging field that integrates three different but interrelated domains, namely IT operations, AI, and security, into one domain, where the expertise from all three domains is used cohesively to secure the cyber assets. It considers cloud operations and security in a holistic framework to collect the metrics required to assess the security threats and train the AI models to take immediate actions. This work focuses on applying the STRIDE threat modeling framework to model the possible threats involved in each component of the chatbot to enable automatic threat detection using AISecOps techniques. This threat modeling framework is tailored to medical chatbots that involve sensitive data sharing, but it could also be applied to chatbots used in other sectors concerned with security and compliance, such as financial services, the public sector, and government.
Submitted 17 May, 2023;
originally announced May 2023.
-
Medical Data Asset Management and an Approach for Disease Prediction using Blockchain and Machine Learning
Authors:
Shruthi K,
Poornima A. S
Abstract:
In the present medical services, the board, clinical well-being records are as electronic clinical record (EHR/EMR) frameworks. These frameworks store patients' clinical histories in a computerized design. Notwithstanding, a patient's clinical information is gained in a productive and ideal way and is demonstrated to be troublesome through these records. Powerlessness constantly prevents the well-being of the board from getting data, less use of data obtained, unmanageable protection controls, and unfortunate information resource security. In this paper, we present an effective and safe clinical information resource, the executives' framework involving Blockchain, to determine these issues. Blockchain innovation facilitates the openness of all such records by keeping a block for each patient. This paper proposes an engineering utilizing an off-chain arrangement that will empower specialists and patients to get records in a protected manner. Blockchain makes clinical records permanent and scrambles them for information honesty. Clients can notice their well-being records, yet just patients own the confidential key and can impart it to those they want.
Smart contracts likewise help our information proprietors to deal with their information access in a permission way. The eventual outcome will be seen as a web and portable connection point to get to, identify, and guarantee high-security information handily. In this adventure, we will give deals with any consequences regarding the issues associated with clinical consideration data and the chiefs using AI and Blockchain. Removing only the imperative information from the data is possible with the use of AI. This is done using arranged estimations. At the point when this data is taken care of, the accompanying issue is information sharing and its constancy.
Submitted 27 April, 2023;
originally announced May 2023.
-
CoNIC Challenge: Pushing the Frontiers of Nuclear Detection, Segmentation, Classification and Counting
Authors:
Simon Graham,
Quoc Dang Vu,
Mostafa Jahanifar,
Martin Weigert,
Uwe Schmidt,
Wenhua Zhang,
Jun Zhang,
Sen Yang,
Jinxi Xiang,
Xiyue Wang,
Josef Lorenz Rumberger,
Elias Baumann,
Peter Hirsch,
Lihao Liu,
Chenyang Hong,
Angelica I. Aviles-Rivero,
Ayushi Jain,
Heeyoung Ahn,
Yiyu Hong,
Hussam Azzuni,
Min Xu,
Mohammad Yaqub,
Marie-Claire Blache,
Benoît Piégu,
Bertrand Vernay,
et al. (64 additional authors not shown)
Abstract:
Nuclear detection, segmentation and morphometric profiling are essential in helping us further understand the relationship between histology and patient outcome. To drive innovation in this area, we set up a community-wide challenge using the largest available dataset of its kind to assess nuclear segmentation and cellular composition. Our challenge, named CoNIC, stimulated the development of reproducible algorithms for cellular recognition with real-time result inspection on public leaderboards. We conducted an extensive post-challenge analysis based on the top-performing models using 1,658 whole-slide images of colon tissue. With around 700 million detected nuclei per model, associated features were used for dysplasia grading and survival analysis, where we demonstrated that the challenge's improvement over the previous state-of-the-art led to significant boosts in downstream performance. Our findings also suggest that eosinophils and neutrophils play an important role in the tumour microenvironment. We release challenge models and WSI-level results to foster the development of further methods for biomarker discovery.
Submitted 14 March, 2023; v1 submitted 10 March, 2023;
originally announced March 2023.
-
AIROGS: Artificial Intelligence for RObust Glaucoma Screening Challenge
Authors:
Coen de Vente,
Koenraad A. Vermeer,
Nicolas Jaccard,
He Wang,
Hongyi Sun,
Firas Khader,
Daniel Truhn,
Temirgali Aimyshev,
Yerkebulan Zhanibekuly,
Tien-Dung Le,
Adrian Galdran,
Miguel Ángel González Ballester,
Gustavo Carneiro,
Devika R G,
Hrishikesh P S,
Densen Puthussery,
Hong Liu,
Zekang Yang,
Satoshi Kondo,
Satoshi Kasai,
Edward Wang,
Ashritha Durvasula,
Jónathan Heras,
Miguel Ángel Zapata,
Teresa Araújo,
et al. (11 additional authors not shown)
Abstract:
The early detection of glaucoma is essential in preventing visual impairment. Artificial intelligence (AI) can be used to analyze color fundus photographs (CFPs) in a cost-effective manner, making glaucoma screening more accessible. While AI models for glaucoma screening from CFPs have shown promising results in laboratory settings, their performance decreases significantly in real-world scenarios due to the presence of out-of-distribution and low-quality images. To address this issue, we propose the Artificial Intelligence for Robust Glaucoma Screening (AIROGS) challenge. This challenge includes a large dataset of around 113,000 images from about 60,000 patients and 500 different screening centers, and encourages the development of algorithms that are robust to ungradable and unexpected input data. We evaluated solutions from 14 teams in this paper, and found that the best teams performed similarly to a set of 20 expert ophthalmologists and optometrists. The highest-scoring team achieved an area under the receiver operating characteristic curve of 0.99 (95% CI: 0.98-0.99) for detecting ungradable images on-the-fly. Additionally, many of the algorithms showed robust performance when tested on three other publicly available datasets. These results demonstrate the feasibility of robust AI-enabled glaucoma screening.
Submitted 10 February, 2023; v1 submitted 3 February, 2023;
originally announced February 2023.
-
Comparative Analysis of Clustering Techniques for Personalized Food Kit Distribution
Authors:
Jude Francis,
Rowan K Baby,
Jacob Abraham,
Ajmal P. S
Abstract:
The Government of Kerala had increased the frequency of supply of free food kits owing to the pandemic; however, these items were static and not indicative of the personal preferences of the consumers. This paper conducts a comparative analysis of various clustering techniques on a scaled-down version of a real-world dataset obtained through a conjoint analysis-based survey. Clustering carried out by centroid-based methods such as k-means is analyzed and the results are plotted along with SVD, and finally, a conclusion is reached as to which of the two is better. Once the clusters have been formulated, commodities are also decided upon for each cluster. Also, clustering is further enhanced by reassignment, based on a specific cluster loss threshold. Thus, the most efficacious clustering technique for designing a food kit tailored to the needs of individuals is finally obtained.
Submitted 30 December, 2022;
originally announced December 2022.
-
Real time QKD Post Processing based on Reconfigurable Hardware Acceleration
Authors:
Foram P Shingala,
Natarajan Venkatachalam,
Selvagangai C,
Hema Priya S,
Dillibabu S,
Pooja Chandravanshi,
Ravindra P. Singh
Abstract:
Key distillation is an essential component of every Quantum Key Distribution (QKD) system because it compensates for the inherent transmission errors of the quantum channel. However, the throughput and interoperability aspects of post-processing engine design are often neglected, and existing solutions do not provide any guarantee. In this paper, we propose a high-throughput key distillation framework with multi-protocol support, implemented in a Field Programmable Gate Array (FPGA) using High-Level Synthesis (HLS). The proposed design uses a Hadoop framework with a map-reduce programming model to efficiently process large chunks of raw data across the limited computing resources of an FPGA. We present a novel hardware-efficient integrated post-processing architecture that offers dynamic error correction, a side-channel resistant authentication scheme, and an inbuilt high-speed encryption application, which uses the key for secure communication. We develop a semi-automated high-level synthesis framework capable of handling different QKD protocols with promising speedup. Overall, the experimental results show a significant improvement in performance, and the design is compatible with any discrete-variable QKD system.
Submitted 30 November, 2022;
originally announced November 2022.
-
Towards Building Text-To-Speech Systems for the Next Billion Users
Authors:
Gokul Karthik Kumar,
Praveen S V,
Pratyush Kumar,
Mitesh M. Khapra,
Karthik Nandakumar
Abstract:
Deep learning based text-to-speech (TTS) systems have been evolving rapidly with advances in model architectures, training methodologies, and generalization across speakers and languages. However, these advances have not been thoroughly investigated for Indian language speech synthesis. Such investigation is computationally expensive given the number and diversity of Indian languages, relatively lower resource availability, and the diverse set of advances in neural TTS that remain untested. In this paper, we evaluate the choice of acoustic models, vocoders, supplementary loss functions, training schedules, and speaker and language diversity for Dravidian and Indo-Aryan languages. Based on this, we identify monolingual models with FastPitch and HiFi-GAN V1, trained jointly on male and female speakers to perform the best. With this setup, we train and evaluate TTS models for 13 languages and find our models to significantly improve upon existing models in all languages as measured by mean opinion scores. We open-source all models on the Bhashini platform.
Submitted 17 February, 2023; v1 submitted 17 November, 2022;
originally announced November 2022.
-
MultiViz: A Gephi Plugin for Scalable Visualization of Multi-Layer Networks
Authors:
Jayamohan Pillai C. S.,
Ayan Chatterjee,
Geetha M.,
Amitava Mukherjee
Abstract:
The process of visually presenting networks is an effective way to understand entity relationships within the networks since it reveals the overall structure and topology of the network. Real networks are extremely difficult to visualize due to their immense complexity, which includes vast amounts of data, several types of interactions, various subsystems and several levels of connectivity as well as changes over time. This paper introduces the "MultiViz Plugin," a plugin for Gephi, an open-source software tool for graph visualization and modification, in order to visualize complex networks in a multi-layer manner. A collection of settings is available through the plugin to transform an existing network into a multi-layered network. The plugin supports several layout algorithms and lets the user choose which property of the network is used as the layer. The goal of the study is to give the user complete control over how the network is visualized in a multi-layer fashion. We demonstrate the ability of the plugin to visualize multi-layer data using real-life complex multi-layer datasets.
Submitted 6 September, 2022;
originally announced September 2022.
-
3DeformRS: Certifying Spatial Deformations on Point Clouds
Authors:
Gabriel Pérez S.,
Juan C. Pérez,
Motasem Alfarra,
Silvio Giancola,
Bernard Ghanem
Abstract:
3D computer vision models are commonly used in security-critical applications such as autonomous driving and surgical robotics. Emerging concerns over the robustness of these models against real-world deformations must be addressed practically and reliably. In this work, we propose 3DeformRS, a method to certify the robustness of point cloud Deep Neural Networks (DNNs) against real-world deformations. We developed 3DeformRS by building upon recent work that generalized Randomized Smoothing (RS) from pixel-intensity perturbations to vector-field deformations. In particular, we specialized RS to certify DNNs against parameterized deformations (e.g. rotation, twisting), while enjoying practical computational costs. We leverage the virtues of 3DeformRS to conduct a comprehensive empirical study on the certified robustness of four representative point cloud DNNs on two datasets and against seven different deformations. Compared to previous approaches for certifying point cloud DNNs, 3DeformRS is fast, scales well with point cloud size, and provides comparable-to-better certificates. For instance, when certifying a plain PointNet against a 3° z-rotation on 1024-point clouds, 3DeformRS grants a certificate 3x larger and 20x faster than previous work.
Submitted 12 April, 2022;
originally announced April 2022.
-
Can Open Domain Question Answering Systems Answer Visual Knowledge Questions?
Authors:
Jiawen Zhang,
Abhijit Mishra,
Avinesh P. V. S,
Siddharth Patwardhan,
Sachin Agarwal
Abstract:
The task of Outside Knowledge Visual Question Answering (OKVQA) requires an automatic system to answer natural language questions about pictures and images using external knowledge. We observe that many visual questions, which contain deictic referential phrases referring to entities in the image, can be rewritten as "non-grounded" questions and can be answered by existing text-based question answering systems. This allows for the reuse of existing text-based Open Domain Question Answering (QA) Systems for visual question answering. In this work, we propose a potentially data-efficient approach that reuses existing systems for (a) image analysis, (b) question rewriting, and (c) text-based question answering to answer such visual questions. Given an image and a question pertaining to that image (a visual question), we first extract the entities present in the image using pre-trained object and scene classifiers. Using these detected entities, the visual questions can be rewritten so as to be answerable by open domain QA systems. We explore two rewriting strategies: (1) an unsupervised method using BERT for masking and rewriting, and (2) a weakly supervised approach that combines adaptive rewriting and reinforcement learning techniques to use the implicit feedback from the QA system. We test our strategies on the publicly available OKVQA dataset and obtain a competitive performance with state-of-the-art models while using only 10% of the training data.
Submitted 9 February, 2022;
originally announced February 2022.
-
On chromatic parameters of some Regular graphs
Authors:
Prajnanaswaroopa S
Abstract:
In this work, we try to enunciate the Total chromatic number of some Cayley graphs like the Cayley graph on Symmetric group, Alternating group, Dihedral group with respect to some generating sets and some other regular graphs.
Submitted 21 March, 2022; v1 submitted 4 February, 2022;
originally announced February 2022.
-
Model Stability with Continuous Data Updates
Authors:
Huiting Liu,
Avinesh P. V. S.,
Siddharth Patwardhan,
Peter Grasch,
Sachin Agarwal
Abstract:
In this paper, we study the "stability" of machine learning (ML) models within the context of larger, complex NLP systems with continuous training data updates. For this study, we propose a methodology for the assessment of model stability (which we refer to as jitter) under various experimental conditions. We find that model design choices, including network architecture and input representation, have a critical impact on stability through experiments on four text classification tasks and two sequence labeling tasks. In classification tasks, non-RNN-based models are observed to be more stable than RNN-based ones, while the encoder-decoder model is less stable in sequence labeling tasks. Moreover, input representations based on pre-trained fastText embeddings contribute to more stability than other choices. We also show that two learning strategies -- ensemble models and incremental training -- have a significant influence on stability. We recommend ML model designers account for trade-offs in accuracy and jitter when making modeling choices.
Submitted 14 January, 2022;
originally announced January 2022.
-
NL-Augmenter: A Framework for Task-Sensitive Natural Language Augmentation
Authors:
Kaustubh D. Dhole,
Varun Gangal,
Sebastian Gehrmann,
Aadesh Gupta,
Zhenhao Li,
Saad Mahamood,
Abinaya Mahendiran,
Simon Mille,
Ashish Shrivastava,
Samson Tan,
Tongshuang Wu,
Jascha Sohl-Dickstein,
Jinho D. Choi,
Eduard Hovy,
Ondrej Dusek,
Sebastian Ruder,
Sajant Anand,
Nagender Aneja,
Rabin Banjade,
Lisa Barthe,
Hanna Behnke,
Ian Berlot-Attwell,
Connor Boyle,
Caroline Brun,
Marco Antonio Sobrevilla Cabezudo,
et al. (101 additional authors not shown)
Abstract:
Data augmentation is an important component in the robustness evaluation of models in natural language processing (NLP) and in enhancing the diversity of the data they are trained on. In this paper, we present NL-Augmenter, a new participatory Python-based natural language augmentation framework which supports the creation of both transformations (modifications to the data) and filters (data splits according to specific features). We describe the framework and an initial set of 117 transformations and 23 filters for a variety of natural language tasks. We demonstrate the efficacy of NL-Augmenter by using several of its transformations to analyze the robustness of popular natural language models. The infrastructure, datacards and robustness analysis results are available publicly on the NL-Augmenter repository (https://github.com/GEM-benchmark/NL-Augmenter).
Submitted 11 October, 2022; v1 submitted 5 December, 2021;
originally announced December 2021.
-
Security Monitoring System Using FaceNet For Wireless Sensor Network
Authors:
Preetha S,
Sheela S V
Abstract:
Wireless sensor networks are used to monitor remote areas. A wireless sensor network can be applied to monitor a facility by treating each camera as a sensor node. Cameras are used as nodes in a wireless sensor network, with a central server or gateway node handling all monitoring and analysis of the information retrieved from the nodes. Identification and authentication of users in any organization is quite difficult due to high movement. Face recognition can be used to detect faces and identify them continuously in a video feed, which can be deployed to continuously monitor an area. The feed from the cameras to the base station uses Multi-task Cascaded Convolutional Neural Networks (MTCNN) and FaceNet algorithms for face recognition. Further information about the person is sent to all the end-user nodes present in the wireless network. This approach has been implemented and evaluated on a prototype wired camera network called FaceNet. We describe a method for tracking people in 2D world coordinates and acquiring canonical frontal face images that fits the sensor network paradigm. The approach evaluates and demonstrates the tasking algorithm in action on data acquired from the FaceNet camera network. In this paper, the face recognition algorithm FaceNet is used to implement a security monitoring network.
Submitted 2 December, 2021;
originally announced December 2021.
-
Scalable Machine Learning Architecture for Neonatal Seizure Detection on Ultra-Edge Devices
Authors:
Vishal Nagarajan,
Ashwini Muralidharan,
Deekshitha Sriraman,
Pravin Kumar S
Abstract:
Neonatal seizures are a commonly encountered neurological condition. They are the first clinical signs of a serious neurological disorder. Thus, rapid recognition and treatment are necessary to prevent serious fatalities. The use of electroencephalography (EEG) in the field of neurology allows precise diagnosis of several medical conditions. However, interpreting EEG signals needs the attention of highly specialized staff since the infant brain is developmentally immature during the neonatal period. Detecting seizures on time could potentially prevent the negative effects on the neurocognitive development of the infants. In recent years, neonatal seizure detection using machine learning algorithms has been gaining traction. Since there is a need for the classification of bio-signals to be computationally inexpensive in the case of seizure detection, this research presents a machine learning (ML) based architecture that operates with comparable predictive performance as previous models but with minimum level configuration. The proposed classifier was trained and tested on a public dataset of NICU seizures recorded at the Helsinki University Hospital. Our architecture achieved a best sensitivity of 87%, which is 6% more than that of the standard ML model chosen in this study. The model size of the ML classifier is optimized to just 4.84 KB with a minimum prediction time of 182.61 milliseconds, thus enabling it to be deployed on wearable ultra-edge devices for quick and accurate response and obviating the need for cloud-based and other such exhaustive computational methods.
Submitted 29 November, 2021;
originally announced November 2021.
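The abstract above reports sensitivity as its headline metric. For reference, sensitivity is the fraction of true seizure segments that the classifier flags, TP / (TP + FN). The sketch below computes it from aligned label lists; the labels are illustrative only and are not drawn from the Helsinki dataset, and the `sensitivity` helper is hypothetical.

```python
def sensitivity(y_true, y_pred, positive=1):
    """Sensitivity (recall of the positive class): TP / (TP + FN)."""
    true_pos = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    false_neg = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    if true_pos + false_neg == 0:
        raise ValueError("no positive examples in y_true")
    return true_pos / (true_pos + false_neg)


# Illustrative EEG segment labels: 1 = seizure, 0 = non-seizure.
y_true = [1, 1, 1, 1, 0, 0, 0, 1]
y_pred = [1, 1, 0, 1, 0, 1, 0, 1]
score = sensitivity(y_true, y_pred)  # 4 of 5 seizure segments detected
```

Sensitivity ignores false alarms on non-seizure segments, so in practice it is usually read alongside specificity or a false-alarm rate.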