Search | arXiv e-print repository

UQLM: A Python Package for Uncertainty Quantification in Large Language Models

Authors: Dylan Bouchard, Mohit Singh Chauhan, David Skarbrevik, Ho-Kyeong Ra, Viren Bajaj, Zeya Ahmad

Abstract: Hallucinations, defined as instances where Large Language Models (LLMs) generate false or misleading content, pose a significant challenge that impacts the safety and trust of downstream applications. We introduce UQLM, a Python package for LLM hallucination detection using state-of-the-art uncertainty quantification (UQ) techniques. This toolkit offers a suite of UQ-based scorers that compute response-level confidence scores ranging from 0 to 1. This library provides an off-the-shelf solution for UQ-based hallucination detection that can be easily integrated to enhance the reliability of LLM outputs.

Submitted 8 July, 2025; originally announced July 2025.

Comments: Submitted to Journal of Machine Learning Research (MLOSS); UQLM Repository: https://github.com/cvs-health/uqlm

arXiv:2506.21298 [pdf, ps, other]

Exploring Adapter Design Tradeoffs for Low Resource Music Generation

Authors: Atharva Mehta, Shivam Chauhan, Monojit Choudhury

Abstract: Fine-tuning large-scale music generation models, such as MusicGen and Mustango, is a computationally expensive process, often requiring updates to billions of parameters and, therefore, significant hardware resources. Parameter-Efficient Fine-Tuning (PEFT) techniques, particularly adapter-based methods, have emerged as a promising alternative, enabling adaptation with minimal trainable parameters while preserving model performance. However, the design choices for adapters, including their architecture, placement, and size, are numerous, and it is unclear which combinations would produce optimal adapters, and why, for a given low-resource music genre. In this paper, we attempt to answer this question by studying various adapter configurations for two AI music models, MusicGen and Mustango, on two genres: Hindustani Classical and Turkish Makam music. Our findings reveal distinct trade-offs: convolution-based adapters excel in capturing fine-grained local musical details such as ornamentations and short melodic phrases, while transformer-based adapters better preserve long-range dependencies crucial for structured improvisation. Additionally, we analyze computational resource requirements across different adapter scales, demonstrating how mid-sized adapters (40M parameters) achieve an optimal balance between expressivity and quality. Furthermore, we find that Mustango, a diffusion-based model, generates more diverse outputs with better adherence to the description in the input prompt, but lacks stability in notes, rhythm alignment, and aesthetics. It is also computationally intensive and requires significantly more time to train. In contrast, autoregressive models like MusicGen train faster, are more efficient, and can produce better quality output in comparison, but have slightly higher redundancy in their generations.

Submitted 26 June, 2025; originally announced June 2025.

Comments: 9 pages, 5 figures

arXiv:2506.11973 [pdf, ps, other]

Self-Regulating Cars: Automating Traffic Control in Free Flow Road Networks

Authors: Ankit Bhardwaj, Rohail Asim, Sachin Chauhan, Yasir Zaki, Lakshminarayanan Subramanian

Abstract: Free-flow road networks, such as suburban highways, are increasingly experiencing traffic congestion due to growing commuter inflow and limited infrastructure. Traditional control mechanisms, such as traffic signals or local heuristics, are ineffective or infeasible in these high-speed, signal-free environments. We introduce self-regulating cars, a reinforcement learning-based traffic control protocol that dynamically modulates vehicle speeds to optimize throughput and prevent congestion, without requiring new physical infrastructure. Our approach integrates classical traffic flow theory, gap acceptance models, and microscopic simulation into a physics-informed RL framework. By abstracting roads into super-segments, the agent captures emergent flow dynamics and learns robust speed modulation policies from instantaneous traffic observations. Evaluated in the high-fidelity PTV Vissim simulator on a real-world highway network, our method improves total throughput by 5%, reduces average delay by 13%, and decreases total stops by 3% compared to the no-control setting. It also achieves smoother, congestion-resistant flow while generalizing across varied traffic patterns, demonstrating its potential for scalable, ML-driven traffic management.

Submitted 13 June, 2025; originally announced June 2025.

arXiv:2504.19254 [pdf, other]

Uncertainty Quantification for Language Models: A Suite of Black-Box, White-Box, LLM Judge, and Ensemble Scorers

Authors: Dylan Bouchard, Mohit Singh Chauhan

Abstract: Hallucinations are a persistent problem with Large Language Models (LLMs). As these models become increasingly used in high-stakes domains, such as healthcare and finance, the need for effective hallucination detection is crucial. To this end, we propose a versatile framework for zero-resource hallucination detection that practitioners can apply to real-world use cases. To achieve this, we adapt a variety of existing uncertainty quantification (UQ) techniques, including black-box UQ, white-box UQ, and LLM-as-a-Judge, transforming them as necessary into standardized response-level confidence scores ranging from 0 to 1. To enhance flexibility, we introduce a tunable ensemble approach that incorporates any combination of the individual confidence scores. This approach enables practitioners to optimize the ensemble for a specific use case for improved performance. To streamline implementation, the full suite of scorers is offered in this paper's companion Python toolkit, UQLM. To evaluate the performance of the various scorers, we conduct an extensive set of experiments using several LLM question-answering benchmarks. We find that our tunable ensemble typically surpasses its individual components and outperforms existing hallucination detection methods. Our results demonstrate the benefits of customized hallucination detection strategies for improving the accuracy and reliability of LLMs.

Submitted 30 April, 2025; v1 submitted 27 April, 2025; originally announced April 2025.

Comments: UQLM repository: https://github.com/cvs-health/uqlm
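
Illustrative note: the abstract above describes combining individual confidence scores, each in the range 0 to 1, into a tunable ensemble. A minimal sketch of one way such a combination could look is given below, assuming a normalized weighted average. The function name `ensemble_confidence`, the scorer labels, and the weights are hypothetical; this is not the UQLM API, and the paper's actual ensemble and tuning procedure may differ.

```python
from typing import Mapping


def ensemble_confidence(
    scores: Mapping[str, float], weights: Mapping[str, float]
) -> float:
    """Combine per-scorer confidence scores into one score in [0, 1].

    Hypothetical helper: a normalized weighted average, used here only
    to illustrate the idea of a tunable ensemble of confidence scores.
    """
    if not scores:
        raise ValueError("at least one score is required")

    weighted_sum = 0.0
    total_weight = 0.0
    for name, score in scores.items():
        if not 0.0 <= score <= 1.0:
            raise ValueError(f"score for {name!r} must lie in [0, 1]")
        weight = weights.get(name, 0.0)  # scorers without a weight are ignored
        if weight < 0.0:
            raise ValueError(f"weight for {name!r} must be non-negative")
        weighted_sum += weight * score
        total_weight += weight

    if total_weight == 0.0:
        raise ValueError("weights must not all be zero")
    # Dividing by the total weight keeps the result inside [0, 1].
    return weighted_sum / total_weight


# Made-up scores and weights for three hypothetical scorers.
example_scores = {"black_box": 0.8, "white_box": 0.6, "judge": 1.0}
example_weights = {"black_box": 0.5, "white_box": 0.25, "judge": 0.25}
combined = ensemble_confidence(example_scores, example_weights)
```

In this sketch, "tuning" would mean choosing the weights that perform best on a labeled set of responses for the use case at hand; how that search is carried out is left open here.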

arXiv:2504.06011 [pdf, other]

Llama-3-Nanda-10B-Chat: An Open Generative Large Language Model for Hindi

Authors: Monojit Choudhury, Shivam Chauhan, Rocktim Jyoti Das, Dhruv Sahnan, Xudong Han, Haonan Li, Aaryamonvikram Singh, Alok Anil Jadhav, Utkarsh Agarwal, Mukund Choudhary, Debopriyo Banerjee, Fajri Koto, Junaid Bhat, Awantika Shukla, Samujjwal Ghosh, Samta Kamboj, Onkar Pandit, Lalit Pradhan, Rahul Pal, Sunil Sahu, Soundar Doraiswamy, Parvez Mullah, Ali El Filali, Neha Sengupta, Gokul Ramakrishnan, et al. (5 additional authors not shown)

Abstract: Developing high-quality large language models (LLMs) for moderately resourced languages presents unique challenges in data availability, model adaptation, and evaluation. We introduce Llama-3-Nanda-10B-Chat, or Nanda for short, a state-of-the-art Hindi-centric instruction-tuned generative LLM, designed to push the boundaries of open-source Hindi language models. Built upon Llama-3-8B, Nanda incorporates continuous pre-training with expanded transformer blocks, leveraging the Llama Pro methodology. A key challenge was the limited availability of high-quality Hindi text data; we addressed this through rigorous data curation, augmentation, and strategic bilingual training, balancing Hindi and English corpora to optimize cross-linguistic knowledge transfer. With 10 billion parameters, Nanda stands among the top-performing open-source Hindi and multilingual models of similar scale, demonstrating significant advantages over many existing models. We provide an in-depth discussion of training strategies, fine-tuning techniques, safety alignment, and evaluation metrics, demonstrating how these approaches enabled Nanda to achieve state-of-the-art results. By open-sourcing Nanda, we aim to advance research in Hindi LLMs and support a wide range of real-world applications across academia, industry, and public services.

Submitted 8 April, 2025; originally announced April 2025.

arXiv:2502.09766 [pdf, other]

LLM-Generated Microservice Implementations from RESTful API Definitions

Authors: Saurabh Chauhan, Zeeshan Rasheed, Abdul Malik Sami, Zheying Zhang, Jussi Rasku, Kai-Kristian Kemell, Pekka Abrahamsson

Abstract: The growing need for scalable, maintainable, and fast-deploying systems has made microservice architecture widely popular in software development. This paper presents a system that uses Large Language Models (LLMs) to automate the API-first development of RESTful microservices. This system assists in creating an OpenAPI specification, generating server code from it, and refining the code through a feedback loop that analyzes execution logs and error messages. By focusing on the API-first methodology, this system ensures that microservices are designed with well-defined interfaces, promoting consistency and reliability across the development life-cycle. The integration of log analysis enables the LLM to detect and address issues efficiently, reducing the number of iterations required to produce functional and robust services. This process automates the generation of microservices and also simplifies the debugging and refinement phases, allowing developers to focus on higher-level design and integration tasks. This system has the potential to benefit software developers, architects, and organizations by speeding up software development cycles and reducing manual effort. To assess the potential of the system, we conducted surveys with six industry practitioners. The surveys indicated notable advantages in enhancing development speed, automating repetitive tasks, and simplifying the prototyping process. While experienced developers appreciated its efficiency for specific tasks, some expressed concerns about its limitations in handling advanced customizations and larger scale projects. The code is publicly available at https://github.com/sirbh/code-gen

Submitted 13 February, 2025; originally announced February 2025.

arXiv:2502.08689 [pdf]

Advancing machine fault diagnosis: A detailed examination of convolutional neural networks

Authors: Govind Vashishtha, Sumika Chauhan, Mert Sehri, Justyna Hebda-Sobkowicz, Radoslaw Zimroz, Patrick Dumond, Rajesh Kumar

Abstract: The growing complexity of machinery and the increasing demand for operational efficiency and safety have driven the development of advanced fault diagnosis techniques. Among these, convolutional neural networks (CNNs) have emerged as a powerful tool, offering robust and accurate fault detection and classification capabilities. This comprehensive review delves into the application of CNNs in machine fault diagnosis, covering their theoretical foundation, architectural variations, and practical implementations. The strengths and limitations of CNNs in this domain are analyzed, with a discussion of their effectiveness in handling various fault types, data complexities, and operational environments. Furthermore, we explore the evolving landscape of CNN-based fault diagnosis, examining recent advancements in data augmentation, transfer learning, and hybrid architectures. Finally, we highlight future research directions and potential challenges to further enhance the application of CNNs for reliable and proactive machine fault diagnosis.

Submitted 12 February, 2025; originally announced February 2025.

arXiv:2502.07391 [pdf, other]

Target-Augmented Shared Fusion-based Multimodal Sarcasm Explanation Generation

Authors: Palaash Goel, Dushyant Singh Chauhan, Md Shad Akhtar

Abstract: Sarcasm is a linguistic phenomenon that intends to ridicule a target (e.g., entity, event, or person) in an inherent way. Multimodal Sarcasm Explanation (MuSE) aims at revealing the intended irony in a sarcastic post using a natural language explanation. Though important, existing systems overlooked the significance of the target of sarcasm in generating explanations. In this paper, we propose a Target-aUgmented shaRed fusion-Based sarcasm explanatiOn model, aka TURBO. We design a novel shared-fusion mechanism to leverage the inter-modality relationships between an image and its caption. TURBO assumes the target of the sarcasm and guides the multimodal shared fusion mechanism in learning intricacies of the intended irony for explanations. We evaluate our proposed TURBO model on the MORE+ dataset. Comparison against multiple baselines and state-of-the-art models signifies the performance improvement of TURBO by an average margin of $+3.3\%$. Moreover, we explore LLMs in zero and one-shot settings for our task and observe that LLM-generated explanations, though remarkable, often fail to capture the critical nuances of the sarcasm. Furthermore, we supplement our study with extensive human evaluation of TURBO's generated explanations and find them to be comparatively better than those of other systems.

Submitted 11 February, 2025; originally announced February 2025.

arXiv:2502.07328 [pdf, other]

Music for All: Representational Bias and Cross-Cultural Adaptability of Music Generation Models

Authors: Atharva Mehta, Shivam Chauhan, Amirbek Djanibekov, Atharva Kulkarni, Gus Xia, Monojit Choudhury

Abstract: The advent of Music-Language Models has greatly enhanced the automatic music generation capability of AI systems, but they are also limited in their coverage of the musical genres and cultures of the world. We present a study of the datasets and research papers for music generation and quantify the bias and under-representation of genres. We find that only 5.7% of the total hours of existing music datasets come from non-Western genres, which naturally leads to disparate performance of the models across genres. We then investigate the efficacy of Parameter-Efficient Fine-Tuning (PEFT) techniques in mitigating this bias. Our experiments with two popular models -- MusicGen and Mustango, for two underrepresented non-Western music traditions -- Hindustani Classical and Turkish Makam music, highlight the promises as well as the non-triviality of cross-genre adaptation of music through small datasets, implying the need for more equitable baseline music-language models that are designed for cross-cultural transfer learning.

Submitted 6 May, 2025; v1 submitted 11 February, 2025; originally announced February 2025.

Comments: 17 pages, 5 figures, accepted to NAACL'25

arXiv:2501.03112 [pdf, other]

doi 10.21105/joss.07570

LangFair: A Python Package for Assessing Bias and Fairness in Large Language Model Use Cases

Authors: Dylan Bouchard, Mohit Singh Chauhan, David Skarbrevik, Viren Bajaj, Zeya Ahmad

Abstract: Large Language Models (LLMs) have been observed to exhibit bias in numerous ways, potentially creating or worsening outcomes for specific groups identified by protected attributes such as sex, race, sexual orientation, or age. To help address this gap, we introduce LangFair, an open-source Python package that aims to equip LLM practitioners with the tools to evaluate bias and fairness risks relevant to their specific use cases. The package offers functionality to easily generate evaluation datasets, comprised of LLM responses to use-case-specific prompts, and subsequently calculate applicable metrics for the practitioner's use case. To guide in metric selection, LangFair offers an actionable decision framework.

Submitted 6 January, 2025; originally announced January 2025.

Comments: Journal of Open Source Software; LangFair repository: https://github.com/cvs-health/langfair

Journal ref: Journal of Open Source Software, 10(105), 7570 (2025)

arXiv:2412.04100 [pdf, other]

Missing Melodies: AI Music Generation and its "Nearly" Complete Omission of the Global South

Authors: Atharva Mehta, Shivam Chauhan, Monojit Choudhury

Abstract: Recent advances in generative AI have sparked renewed interest and expanded possibilities for music generation. However, the performance and versatility of these systems across musical genres are heavily influenced by the availability of training data. We conducted an extensive analysis of over one million hours of audio datasets used in AI music generation research and manually reviewed more than 200 papers from eleven prominent AI and music conferences and organizations (AAAI, ACM, EUSIPCO, EURASIP, ICASSP, ICML, IJCAI, ISMIR, NeurIPS, NIME, SMC) to identify a critical gap in the fair representation and inclusion of the musical genres of the Global South in AI research. Our findings reveal a stark imbalance: approximately 86% of the total dataset hours and over 93% of researchers focus primarily on music from the Global North. Although around 40% of these datasets include some form of non-Western music, genres from the Global South account for only 14.6% of the data. Furthermore, approximately 51% of the papers surveyed concentrate on symbolic music generation, a method that often fails to capture the cultural nuances inherent in music from regions such as South Asia, the Middle East, and Africa. As AI increasingly shapes the creation and dissemination of music, the significant underrepresentation of music genres in datasets and research presents a serious threat to global musical diversity. We also propose some important steps to mitigate these risks and foster a more inclusive future for AI-driven music generation.

Submitted 12 December, 2024; v1 submitted 5 December, 2024; originally announced December 2024.

Comments: Submitted to CACM, 12 pages, 2 figures

arXiv:2411.19497 [pdf, other]

SANGO: Socially Aware Navigation through Grouped Obstacles

Authors: Rahath Malladi, Amol Harsh, Arshia Sangwan, Sunita Chauhan, Sandeep Manjanna

Abstract: This paper introduces SANGO (Socially Aware Navigation through Grouped Obstacles), a novel method that ensures socially appropriate behavior by dynamically grouping obstacles and adhering to social norms. Using deep reinforcement learning, SANGO trains agents to navigate complex environments leveraging the DBSCAN algorithm for obstacle clustering and Proximal Policy Optimization (PPO) for path planning. The proposed approach improves safety and social compliance by maintaining appropriate distances and reducing collision rates. Extensive experiments conducted in custom simulation environments demonstrate SANGO's superior performance in significantly reducing discomfort (by up to 83.5%), reducing collision rates (by up to 29.4%) and achieving higher successful navigation in dynamic and crowded scenarios. These findings highlight the potential of SANGO for real-world applications, paving the way for advanced socially adept robotic navigation systems.

Submitted 29 November, 2024; originally announced November 2024.

Comments: Indian Control Conference 2024 (ICC-10)

arXiv:2411.06970 [pdf]

Enhancing Accessibility in Special Libraries: A Study on AI-Powered Assistive Technologies for Patrons with Disabilities

Authors: Snehasish Paul, Shivali Chauhan

Abstract: This study seeks to identify the potential role of AI-driven assistive technologies in enhancing access to libraries for persons with varying degrees of challenges. Traditional libraries pose a problem to many users with vision and mobility impairments, among other physical conditions and infirmities. This mixed-methods research approach will examine ways in which AI-powered assistive tools and applications associated with text-to-speech, navigation systems, and personalized assistants are revolutionizing library services through a literature review, survey methods, interviews, and case studies. Our findings suggest that these technologies greatly increase the autonomy and participation of people with physical disabilities, providing personalized support and access to a wide range of resources. The key findings show a strong impact on user experience and efficiency in services, while at the same time bringing out important considerations related to privacy and ethical implementation. This study highlights the central role of AI in making library settings more inclusive, thereby allowing equal access to knowledge and participation in the community. These insights serve library professionals, policymakers, and technology developers in sustaining innovation, and future research directions are proposed to refine such technologies, especially toward the special needs of diverse populations. By adopting AI, libraries could uphold their mission of providing equal access to knowledge through full and equal participation of all persons, regardless of physical ability, in the learning and community activities carried out by the library. This study paves the way for future innovations in creating more accessible and inclusive library spaces.

Submitted 11 November, 2024; originally announced November 2024.

arXiv:2411.02993 [pdf]

Empowering Library Users: Creative Strategies for Engagement and Innovation

Authors: Snehasish Paul, Shivali Chauhan, Atul Kumar Pal

Abstract: This study investigated the integration of cutting-edge technologies and methodologies for creating dynamic, user-centered library environments. In creative strategies for engagement and innovation, library users must be empowered to undertake the new role of modernizing library services and enhancing user experiences. It also enhances information management and user engagement. This can be attained through personalized approaches, from recommendation systems to interactive platforms, that offer effective experiences tailored to different kinds of users. It investigates the consumer engagement practices of enthusiasm, sharing, and learning, and their roles in cognitive, affective, and behavioural engagement. Combined, these new approaches will help promote learning, interaction, and growth, add value, and have a more positive impact on users. The challenge for libraries in this rapidly changing, technologically advancing, and digitally networked world, with a base of expectant users, is to remain relevant and engaging. This study discusses innovative strategies for empowering library users and enhancing their engagement through creative and technological approaches. This investigation was conducted to integrate cutting-edge technologies and methodologies into creating dynamic library settings that are user-centered and foster learning, interaction, and personal growth.

Submitted 5 November, 2024; originally announced November 2024.

arXiv:2410.14815 [pdf, other]

Adapting Multilingual LLMs to Low-Resource Languages using Continued Pre-training and Synthetic Corpus

Authors: Raviraj Joshi, Kanishk Singla, Anusha Kamath, Raunak Kalani, Rakesh Paul, Utkarsh Vaidya, Sanjay Singh Chauhan, Niranjan Wartikar, Eileen Long

Abstract: Multilingual LLMs support a variety of languages; however, their performance is suboptimal for low-resource languages. In this work, we emphasize the importance of continued pre-training of multilingual LLMs and the use of translation-based synthetic pre-training corpora for improving LLMs in low-resource languages. We conduct our study in the context of the low-resource Indic language Hindi. We introduce Nemotron-Mini-Hindi 4B, a bilingual SLM supporting both Hindi and English, based on Nemotron-Mini 4B. The model is trained using a mix of real and synthetic Hindi + English tokens, with continuous pre-training performed on 400B tokens. We demonstrate that both the base and instruct models achieve state-of-the-art results on Hindi benchmarks while remaining competitive on English tasks. Additionally, we observe that the continued pre-training approach enhances the model's overall factual accuracy. We perform an ablation study to highlight the impact of Hindi pre-training, showing significant improvements in Hindi chat capabilities and factual accuracy, which cannot be achieved through Hindi alignment alone.

Submitted 21 April, 2025; v1 submitted 18 October, 2024; originally announced October 2024.

arXiv:2410.11091 [pdf, other]

Energy-Efficient Cryogenic Ternary Content Addressable Memory using Ferroelectric SQUID

Authors: Shamiul Alam, Simon Thomann, Shivendra Singh Parihar, Yogesh Singh Chauhan, Kai Ni, Hussam Amrouch, Ahmedullah Aziz

Abstract: Ternary content addressable memories (TCAMs) are useful for certain computing tasks since they allow us to compare a search query with a whole dataset stored in the memory array. They can also unlock unique advantages for cryogenic applications like quantum computing, high-performance computing, and space exploration by improving speed and energy efficiency through parallel searching. This paper explores the design and implementation of a cryogenic ternary content addressable memory based on ferroelectric superconducting quantum interference devices (FeSQUIDs). The use of FeSQUID for designing the TCAM provides several unique advantages. First, we can get binary decisions (zero or non-zero voltage) for matching and mismatching conditions without using any peripheral circuitry. Moreover, the proposed TCAM needs ultra-low energy (1.36 aJ and 26.5 aJ average energy consumption for 1-bit binary and ternary search, respectively), thanks to the use of energy-efficient SQUIDs. Finally, we show the efficiency of FeSQUID through the brain-inspired application of Hyperdimensional Computing (HDC). Here, the FeSQUID-based TCAM implements the associative memory to support the highly parallel search needed in the inference step. We estimate an energy consumption of 89.4 fJ per vector comparison using a vector size of 10,000 bits. We also compare the FeSQUID-based TCAM array with the 5nm FinFET-based cryogenic SRAM-based TCAM array and observe that the proposed FeSQUID-based TCAM array consumes over one order of magnitude lower energy while performing the same task.

Submitted 14 October, 2024; originally announced October 2024.

Comments: 6 figures

arXiv:2406.12405 [pdf]

On The Effective Rate and Error Rate Analysis over Fluctuating Nakagami-m Fading Channel

Authors: Manpreet Kaur, Puspraj Singh Chauhan, Sandeep Kumar, Pappu Kumar Verma

Abstract: This paper provides a detailed analysis of important performance metrics, namely effective capacity (EC) and symbol error rate (SER), over the fluctuating Nakagami-m fading channel. This distribution is obtained from the ratio of two random variables, following the Nakagami-m distribution and the uniform distribution. Our study derives exact analytical expressions for the EC and SER under different modulation schemes, considering the effect of channel parameters. Recognising the importance of additive Laplacian noise in today's scenarios, it has been considered for the error performance analysis of the system. This work may be utilised for the design and optimization of systems operating in environments characterized by fluctuating Nakagami-m fading.

Submitted 18 June, 2024; originally announced June 2024.

Comments: 18 pages

arXiv:2405.01600 [pdf, ps, other]

Improved and Explainable Cervical Cancer Classification using Ensemble Pooling of Block Fused Descriptors

Authors: Saurabh Saini, Kapil Ahuja, Akshat S. Chauhan

Abstract: Cervical cancer is the second most common cancer in women and causes high death rates. Earlier models for detecting cervical cancer had limited success. In this work, we propose new models that substantially outperform previous models. Previous studies show that pretrained ResNets extract features from cervical cancer images well. Hence, our first model involves working with three ResNets (50, 101, 152). All the existing works use only the last convolution block of their respective ResNet, which captures abstract features (e.g., shapes, objects). However, we believe that detailed features (e.g., color, edges, texture), coming from earlier convolution blocks, are equally important for cancer (specifically cervical cancer) classification. Since the number of features now becomes large, we use a novel feature selection technique of Global Max Pooling for detailed features and Global Average Pooling for abstract features. Hence, our second model consists of the resulting Cascaded Block Fused variants of the three ResNets. To improve the performance further, we combine and normalize the features of the three standard ResNets as well as our proposed three Cascaded Block Fused ResNets. This type of combination is also new in the cancer classification domain (also in cervical cancer), and results in our third and fourth models, respectively. We use a linear SVM for classification.
We exhaustively perform experiments on two public datasets, IARC and AnnoCerv, achieving an average performance of 97.92% and 92.97%, surpassing the standard ResNets' performance of 90.89% and 87.97%, respectively. We outperform the competitive approach available on the IARC dataset with an average gain of 13.20%, while no prior competitive work is available on AnnoCerv. Additionally, we introduce a novel SHAP+LIME explainability method, accurately identifying the cancerous region in 97% of cases.

Submitted 24 June, 2025; v1 submitted 1 May, 2024; originally announced May 2024.

Comments: 26 Pages, 10 figures, and 8 tables

ACM Class: I.2.1; I.5.2

arXiv:2405.00004 [pdf, other]

Self-healing Nodes with Adaptive Data-Sharding

Authors: Ayush Thakur, Sanskar Chauhan, Ilisha Tomar, Vaibhavi Paul, Deepak Gupta

Abstract: Data sharding, a technique for partitioning and distributing data among multiple servers or nodes, offers enhancements in the scalability, performance, and fault tolerance of extensive distributed systems. Nonetheless, this strategy introduces novel challenges, including load balancing among shards, management of node failures and data loss, and adaptation to evolving data and workload patterns. This paper proposes an innovative approach to tackle these challenges by empowering self-healing nodes with adaptive data sharding. Leveraging concepts such as self-replication, fractal regeneration, sentient data sharding, and symbiotic node clusters, our approach establishes a dynamic and resilient data sharding scheme capable of addressing diverse scenarios and meeting varied requirements. Implementation and evaluation of our approach involve a prototype system simulating a large-scale distributed database across various data sharding scenarios. Comparative analyses against existing data sharding techniques highlight the superior scalability, performance, fault tolerance, and adaptability of our approach. Additionally, the paper delves into potential applications and limitations, providing insights into the future research directions that can further advance this innovative approach.

Submitted 19 January, 2024; originally announced May 2024.

arXiv:2404.16839 [pdf]

Immersed in Reality Secured by Design -- A Comprehensive Analysis of Security Measures in AR/VR Environments

Authors: Sameer Chauhan, Luv Sachdeva

Abstract: Virtual reality and related technologies such as mixed and augmented reality have received extensive coverage in both mainstream and fringe media outlets. When the subject turns to a new AR headset, another AR device, or AR glasses, the talk swiftly shifts to the technical and design details. Unfortunately, no one seems to care about security. Data theft and other forms of cyberattack pose serious threats to virtual reality systems. Virtual reality goggles are just specialist versions of computers or Internet of Things devices, whereas virtual reality experiences are software packages. As a result, AR systems are just as vulnerable as any other Internet of Things (IoT) device we use on a daily basis, such as computers, tablets, and phones. Preventing and responding to common cybersecurity threats and assaults is crucial. Cybercriminals can exploit virtual reality headsets just like any other computer system. This paper analyzes how the data breaches induced by these assaults could result in a variety of concerns, including but not limited to identity theft, the unauthorized acquisition of personal information or network credentials, damage to hardware and software, and so on. Augmented reality (AR) allows for real-time monitoring and visualization of network activity, system logs, and security alerts. This allows security professionals to immediately identify threats, monitor suspicious activities, and fix any issues that develop.
This data can be displayed in an aesthetically pleasing and intuitively structured format using augmented reality interfaces, enabling faster analysis and decision-making.

Submitted 30 January, 2024; originally announced April 2024.

Comments: Cybersecurity. Augmented Reality on, Virtual Reality Implementation errors, Data security and efficiency

arXiv:2401.17705 [pdf]

Predicting suicidal behavior among Indian adults using childhood trauma, mental health questionnaires and machine learning cascade ensembles

Authors: Akash K Rao, Gunjan Y Trivedi, Riri G Trivedi, Anshika Bajpai, Gajraj Singh Chauhan, Vishnu K Menon, Kathirvel Soundappan, Hemalatha Ramani, Neha Pandya, Varun Dutt

Abstract: Among young adults, suicide is India's leading cause of death, accounting for an alarming national suicide rate of around 16%. In recent years, machine learning algorithms have emerged to predict suicidal behavior using various behavioral traits. But to date, the efficacy of machine learning algorithms in predicting suicidal behavior in the Indian context has not been explored in the literature. In this study, different machine learning algorithms and ensembles were developed to predict suicidal behavior based on childhood trauma, different mental health parameters, and other behavioral factors. The dataset was acquired from 391 individuals from a wellness center in India. Information regarding their childhood trauma, psychological wellness, and other mental health issues was acquired through standardized questionnaires. Results revealed that cascade ensemble learning methods using a support vector machine, decision trees, and random forest were able to classify suicidal behavior with an accuracy of 95.04% using data from childhood trauma and mental health questionnaires. The study highlights the potential of using these machine learning ensembles to identify individuals with suicidal tendencies so that targeted interventions could be provided efficiently.

Submitted 31 January, 2024; originally announced January 2024.

Comments: 11 pages, presented at the 4th International Conference on Frontiers in Computing and Systems (COMSYS 2023), Himachal Pradesh, October 2023

arXiv:2311.11521 [pdf]

On the Effective throughput of Shadowed Beaulieu-Xie fading channel

Authors: Manpreet Kaur, Sandeep Kumar, Poonam Yadav, Puspraj Singh Chauhan

Abstract: Given the imperative for advanced wireless networks in the next generation and the rise of real-time applications within wireless communication, there is a notable focus on investigating data rate performance across various fading scenarios. This research analyzes the effective throughput of the shadowed Beaulieu-Xie (SBX) composite fading channel using the PDF-based approach. To get a simplified relationship between the performance parameter and channel parameters, the low-SNR and high-SNR approximations of the effective rate are also provided. The proposed formulations are evaluated for different values of system parameters to study their impact on the effective throughput. Also, the impact of the delay parameter on the EC is investigated. Monte-Carlo simulations are used to verify the validity of the deduced equations.

Submitted 19 November, 2023; originally announced November 2023.

Comments: 18 pages

arXiv:2310.03200 [pdf]

Amazon Books Rating prediction & Recommendation Model

Authors: Hsiu-Ping Lin, Suman Chauhan, Yougender Chauhan, Nagender Chauhan, Jongwook Woo

Abstract: This paper uses an Amazon dataset to predict the ratings of books listed on the Amazon website. As part of this project, we predicted the ratings of the books, and also built a recommendation cluster. This recommendation cluster provides recommended books based on column values from the dataset, for instance, category, description, author, price, reviews, etc. This paper provides a flow of handling big data files, data engineering, building models and providing predictions. The models predict the book ratings column using various PySpark Machine Learning APIs. Additionally, we used hyperparameter and parameter tuning. Also, Cross Validation and TrainValidationSplit were used for generalization. Finally, we compared the accuracies of Binary Classification and Multiclass Classification. We converted our label from multiclass to binary to see if we could find any difference between the two classifications. As a result, we found that we get higher accuracy in binary classification than in multiclass classification.

Submitted 4 October, 2023; originally announced October 2023.

Comments: 5 pages, 4 figures, 8 tables

arXiv:2305.01484 [pdf, other]

Powering Disturb-Free Reconfigurable Computing and Tunable Analog Electronics with Dual-Port Ferroelectric FET

Authors: Zijian Zhao, Shan Deng, Swetaki Chatterjee, Zhouhang Jiang, Muhammad Shaffatul Islam, Yi Xiao, Yixin Xu, Scott Meninger, Mohamed Mohamed, Rajiv Joshi, Yogesh Singh Chauhan, Halid Mulaosmanovic, Stefan Duenkel, Dominik Kleimaier, Sven Beyer, Hussam Amrouch, Vijaykrishnan Narayanan, Kai Ni

Abstract: Single-port ferroelectric FET (FeFET) that performs write and read operations on the same electrical gate prevents its wide application in tunable analog electronics and suffers from read disturb, especially to the high-threshold voltage (VTH) state as the retention energy barrier is reduced by the applied read bias. To address both issues, we propose to adopt a read disturb-free dual-port FeFET where write is performed on the gate featuring a ferroelectric layer and the read is done on a separate gate featuring a non-ferroelectric dielectric. Combining the unique structure and the separate read gate, read disturb is eliminated as the applied field is aligned with polarization in the high-VTH state, thus improving its stability, while it is screened by the channel inversion charge and exerts no negative impact on the low-VTH state stability. Comprehensive theoretical and experimental validation has been performed on fully-depleted silicon-on-insulator (FDSOI) FeFETs integrated on a 22 nm platform, which intrinsically has dual ports with its buried oxide layer acting as the non-ferroelectric dielectric. Novel applications that can exploit the proposed dual-port FeFET are proposed and experimentally demonstrated for the first time, including FPGA that harnesses its read disturb-free feature and tunable analog electronics (e.g., frequency tunable ring oscillator in this work) leveraging the separated write and read paths.

Submitted 2 May, 2023; originally announced May 2023.

Comments: 32 pages

arXiv:2304.03868 [pdf, other]

Compact and High-Performance TCAM Based on Scaled Double-Gate FeFETs

Authors: Liu Liu, Shubham Kumar, Simon Thomann, Yogesh Singh Chauhan, Hussam Amrouch, Xiaobo Sharon Hu

Abstract: Ternary content addressable memory (TCAM), widely used in network routers and high-associativity caches, is gaining popularity in machine learning and data-analytic applications. Ferroelectric FETs (FeFETs) are a promising candidate for implementing TCAM owing to their high ON/OFF ratio, non-volatility, and CMOS compatibility. However, conventional single-gate FeFETs (SG-FeFETs) suffer from relatively high write voltage, low endurance, and potential read disturbance, and face scaling challenges. Recently, a double-gate FeFET (DG-FeFET) has been proposed and outperforms SG-FeFETs in many aspects. This paper investigates TCAM design challenges specific to DG-FeFETs and introduces a novel 1.5T1Fe TCAM design based on DG-FeFETs. A 2-step search with early termination is employed to reduce the cell area and improve energy efficiency. A shared driver design is proposed to reduce the peripherals area. Detailed analysis and SPICE simulation show that the 1.5T1Fe DG-TCAM leads to superior search speed and energy efficiency. The 1.5T1Fe TCAM design can also be built with SG-FeFETs, which achieves search latency and energy improvements compared with the 2FeFET TCAM.

Submitted 13 April, 2023; v1 submitted 7 April, 2023; originally announced April 2023.

Comments: Accepted by Design Automation Conference (DAC) 2023

arXiv:2301.04709 [pdf, other]

Causal Abstraction: A Theoretical Foundation for Mechanistic Interpretability

Authors: Atticus Geiger, Duligur Ibeling, Amir Zur, Maheep Chaudhary, Sonakshi Chauhan, Jing Huang, Aryaman Arora, Zhengxuan Wu, Noah Goodman, Christopher Potts, Thomas Icard

Abstract: Causal abstraction provides a theoretical foundation for mechanistic interpretability, the field concerned with providing intelligible algorithms that are faithful simplifications of the known, but opaque low-level details of black box AI models. Our contributions are (1) generalizing the theory of causal abstraction from mechanism replacement (i.e., hard and soft interventions) to arbitrary mechanism transformation (i.e., functionals from old mechanisms to new mechanisms), (2) providing a flexible, yet precise formalization for the core concepts of polysemantic neurons, the linear representation hypothesis, modular features, and graded faithfulness, and (3) unifying a variety of mechanistic interpretability methods in the common language of causal abstraction, namely, activation and path patching, causal mediation analysis, causal scrubbing, causal tracing, circuit analysis, concept erasure, sparse autoencoders, differential binary masking, distributed alignment search, and steering.

Submitted 8 May, 2025; v1 submitted 11 January, 2023; originally announced January 2023.

arXiv:2208.12618 [pdf]

COVID Future Panel Survey: A Unique Public Dataset Documenting How U.S. Residents' Travel Related Choices Changed During the COVID-19 Pandemic

Authors: Rishabh Singh Chauhan, Matthew Wigginton Bhagat-Conway, Tassio Magassy, Nicole Corcoran, Ehsan Rahimi, Abbie Dirks, Ram Pendyala, Abolfazl Mohammadian, Sybil Derrible, Deborah Salon

Abstract: The COVID-19 pandemic is an unprecedented global crisis that has impacted virtually everyone. We conducted a nationwide online longitudinal survey in the United States to collect information about the shifts in travel-related behavior and attitudes before, during, and after the pandemic. The survey asked questions about commuting, long distance travel, working from home, online learning, online shopping, pandemic experiences, attitudes, and demographic information. The survey has been deployed to the same respondents thrice to observe how the responses to the pandemic have evolved over time. The first wave of the survey was conducted from April 2020 to June 2021, the second wave from November 2020 to August 2021, and the third wave from October 2021 to November 2021. In total, 9,265 responses were collected in the first wave; of these, 2,877 respondents returned for the second wave and 2,728 for the third wave. Survey data are publicly available. This unique dataset can aid policy makers in making decisions in areas including transport, workforce development, and more. This article demonstrates the framework for conducting this online longitudinal survey. It details the step-by-step procedure involved in conducting the survey and in curating the data to make it representative of the national trends.

Submitted 10 August, 2022; originally announced August 2022.

arXiv:2208.10488 [pdf]

Friendliness Of Stack Overflow Towards Newbies

Authors: Aneesh Tickoo, Shweta Chauhan, Gagan Raj Gupta

Abstract: In today's digital world, we have a number of online Question and Answer platforms like Stack Exchange, Quora, and GFG that serve as a medium for people to communicate and help each other. In this paper, we analyzed the effectiveness of Stack Overflow in helping newbies to programming. Every user on this platform goes through a journey. For the first 12 months, we consider them to be a newbie. After 12 months they fall under one of the following categories: Experienced, Lurkers, or Inquisitive. Each question asked has tags assigned to it, and we observe that questions with some specific tags have a faster response time, indicating a more active community in that field than in others. The platform grew steadily up to 2013, after which it started declining, but during the 2020 pandemic we can see rejuvenated activity on the platform.

Submitted 21 August, 2022; originally announced August 2022.

Comments: 12 pages, International Conference on Sustainable Future: Innovations in Education

arXiv:2208.08642 [pdf, ps, other]

On the Integral and Derivative Identities of Bivariate Fox H-Function: Application in Wireless System Performance Analysis

Authors: Puspraj Singh Chauhan, Sandeep Kumar, Imran Shafique Ansari

Abstract: The present work proposes analytical solutions for the integral of the bivariate Fox H-function in combination with algebraic, exponential, and complementary error functions. In addition, the work also presents the derivative identities with respect to function arguments. Further, the suitability of the proposed mathematical solutions is verified with reference to a wireless communication environment, where the fading behaviour of the channel acquires the bivariate Fox H-function structure. Furthermore, asymptotic results for the outage probability and average symbol error probability are presented utilizing the origin probability density function based approach. The obtained results are free from complex analytical functions. Finally, the analytical findings of the paper are compared with the numerical results and also with the Monte-Carlo simulation results to confirm their accuracy.

Submitted 18 August, 2022; originally announced August 2022.

Comments: submitted to IEEE Journal for review

arXiv:2204.02573 [pdf]

Detecting key Soccer match events to create highlights using Computer Vision

Authors: Narayana Darapaneni, Prashant Kumar, Nikhil Malhotra, Vigneswaran Sundaramurthy, Abhaya Thakur, Shivam Chauhan, Krishna Chaitanya Thangeda, Anwesh Reddy Paduri

Abstract: The research and data science community has been fascinated with the development of automatic systems for the detection of key events in a video. Special attention in this field is given to sports video analytics, which could help in identifying key events during a match and help in preparing a strategy for the games going forward. For this paper, we have chosen Football (soccer) as the sport and build a computer vision model that aims to identify important events in a match video in order to create highlights of the match. We built the models based on Faster RCNN and YoloV5 architectures and noticed that, for the amount of data we used for training, Faster RCNN did better than YoloV5 in detecting the events in the match, though it was much slower. Within Faster RCNN, using ResNet50 as a base model gave a better class accuracy of 95.5% as compared to 92% with VGG16 as base model, completely outperforming YoloV5 for our training dataset. We tested with an original video of size 23 minutes and our model could reduce it to 4:50 minutes of highlights capturing almost all important events in the match.

Submitted 6 April, 2022; originally announced April 2022.

arXiv:2112.13511 [pdf]

Design, Manufacturing, and Controls of a Prismatic Quadruped Robot: PRISMA

Authors: Team Robocon, IIT Roorkee: Bhavya Giri Goswami, Aman Verma, Gautam Jha, Vandan Gajjar, Vedant Neekhra, Utkarsh Deepak, Aayush Singh Chauhan

Abstract: Most of the quadrupeds developed are highly actuated, and their control is hence quite cumbersome. They need advanced electronics equipment to solve convoluted inverse kinematic equations continuously. In addition, they demand special and costly sensors to autonomously navigate through the environment as traditional distance sensors usually fail because of the continuous perturbation due to the motion of the robot. Another challenge is maintaining the continuous dynamic stability of the robot while walking, which requires complicated and state-of-the-art control algorithms. This paper presents a thorough description of the hardware design and control architecture of our in-house prismatic joint quadruped robot called the PRISMA. We aim to forge a robust and kinematically stable quadruped robot that can use elementary control algorithms and utilize conventional sensors to navigate an unknown environment. We discuss the benefits and limitations of the robot in terms of its motion, different foot trajectories, manufacturability, and controls.

Submitted 26 December, 2021; originally announced December 2021.

Comments: 14 pages, 16 figures, 4 tables

arXiv:2111.08361 [pdf, ps, other]

From Convolutions towards Spikes: The Environmental Metric that the Community currently Misses

Authors: Aviral Chharia, Shivu Chauhan, Rahul Upadhyay, Vinay Kumar

Abstract: Today, the AI community is obsessed with 'state-of-the-art' scores (80% of papers in NeurIPS) as the major performance metrics, due to which an important parameter, i.e., the environmental metric, remains unreported. Computational capabilities were a limiting factor a decade ago; however, in foreseeable future circumstances, the challenge will be to develop environment-friendly and power-efficient algorithms. The human brain, which has been optimizing itself for almost a million years, consumes the same amount of power as a typical laptop. Therefore, developing nature-inspired algorithms is one solution to it. In this study, we show that currently used ANNs are not what we find in nature, and why, although having lower performance, spiking neural networks, which mirror the mammalian visual cortex, have attracted much interest. We further highlight the hardware gaps restricting researchers from using spike-based computation for developing neuromorphic energy-efficient microchips on a large scale. Using neuromorphic processors instead of traditional GPUs might be more environment friendly and efficient. These processors will turn SNNs into an ideal solution for the problem. This paper presents an in-depth discussion highlighting the current gaps and the lack of comparative research, while proposing new research directions at the intersection of two fields -- neuroscience and deep learning. Further, we define a new evaluation metric 'NATURE' for reporting the carbon footprint of AI models.

Submitted 16 November, 2021; originally announced November 2021.

Comments: NeurIPS 2021 Human-Centered AI Workshop

arXiv:2108.01260 [pdf, other]

M2H2: A Multimodal Multiparty Hindi Dataset For Humor Recognition in Conversations

Authors: Dushyant Singh Chauhan, Gopendra Vikram Singh, Navonil Majumder, Amir Zadeh, Asif Ekbal, Pushpak Bhattacharyya, Louis-philippe Morency, Soujanya Poria

Abstract: Humor recognition in conversations is a challenging task that has recently gained popularity due to its importance in dialogue understanding, including in multimodal settings (i.e., text, acoustics, and visual). The few existing datasets for humor are mostly in English. However, due to the tremendous growth in multilingual content, there is a great demand to build models and systems that support multilingual information access. To this end, we propose a dataset for Multimodal Multiparty Hindi Humor (M2H2) recognition in conversations containing 6,191 utterances from 13 episodes of a very popular TV series "Shrimaan Shrimati Phir Se". Each utterance is annotated with humor/non-humor labels and encompasses acoustic, visual, and textual modalities. We propose several strong multimodal baselines and show the importance of contextual and multimodal information for humor recognition in conversations. The empirical results on M2H2 dataset demonstrate that multimodal information complements unimodal information for humor recognition. The dataset and the baselines are available at http://www.iitp.ac.in/~ai-nlp-ml/resources.html and https://github.com/declare-lab/M2H2-dataset.

Submitted 2 August, 2021; originally announced August 2021.

Comments: ICMI 2021

arXiv:2107.07729 [pdf, other]

Semi-supervised Learning for Marked Temporal Point Processes

Authors: Shivshankar Reddy, Anand Vir Singh Chauhan, Maneet Singh, Karamjit Singh

Abstract: Temporal Point Processes (TPPs) are often used to represent the sequence of events ordered as per the time of occurrence. Owing to their flexible nature, TPPs have been used to model different scenarios and have shown applicability in various real-world applications. While TPPs focus on modeling the event occurrence, Marked Temporal Point Process (MTPP) focuses on modeling the category/class of the event as well (termed as the marker). Research in MTPP has garnered substantial attention over the past few years, with an extensive focus on supervised algorithms. Despite the research focus, limited attention has been given to the challenging problem of developing solutions in semi-supervised settings, where algorithms have access to a mix of labeled and unlabeled data. This research proposes a novel algorithm for Semi-supervised Learning for Marked Temporal Point Processes (SSL-MTPP) applicable in such scenarios. The proposed SSL-MTPP algorithm utilizes a combination of labeled and unlabeled data for learning a robust marker prediction model. The proposed algorithm utilizes an RNN-based Encoder-Decoder module for learning effective representations of the time sequence. The efficacy of the proposed algorithm has been demonstrated via multiple protocols on the Retweet dataset, where the proposed SSL-MTPP demonstrates improved performance in comparison to the traditional supervised learning approach.

Submitted 16 July, 2021; originally announced July 2021.

arXiv:2104.08741 [pdf, other]

CEAR: Cross-Entity Aware Reranker for Knowledge Base Completion

Authors: Keshav Kolluru, Mayank Singh Chauhan, Yatin Nandwani, Parag Singla, Mausam

Abstract: Pre-trained language models (LMs) like BERT have been shown to store factual knowledge about the world. This knowledge can be used to augment the information present in Knowledge Bases, which tend to be incomplete. However, prior attempts at using BERT for the task of Knowledge Base Completion (KBC) resulted in performance worse than embedding-based techniques that rely only on the graph structure. In this work we develop a novel model, Cross-Entity Aware Reranker (CEAR), that uses BERT to re-rank the output of existing KBC models with cross-entity attention. Unlike prior work that scores each entity independently, CEAR uses BERT to score the entities together, which is effective for exploiting its factual knowledge. CEAR achieves a new state of the art for the OLPBench dataset.

Submitted 27 January, 2022; v1 submitted 18 April, 2021; originally announced April 2021.

Comments: We found a bug in the code that invalidates the reported results for FB15k-237 and WN18RR. The results for OLPBench hold the same. We are in process of updating the paper

arXiv:2104.08494 [pdf, other]

Blockchain-Enabled End-to-End Encryption for Instant Messaging Applications

Authors: Raman Singh, Ark Nandan Singh Chauhan, Hitesh Tewari

Abstract: In the era of social media and messaging applications, people are becoming increasingly aware of data privacy issues associated with such apps. Major messaging applications are moving towards end-to-end encryption (E2EE) to give their users the privacy they are demanding. However, the current security mechanisms employed by different service providers are not unfeigned E2EE implementations, and are blended with many vulnerabilities. In the present scenario, the major part of the E2EE mechanism is controlled by the service provider's servers, and the decryption keys are stored by them in case of backup restoration. These shortcomings diminish the user's confidence in the privacy of their data while using these apps. A public key infrastructure (PKI) mechanism can be used to circumvent some of these issues, but it comes with high monetary costs, which makes it impossible to roll out for millions of users. The paper proposes a blockchain-based E2EE framework that can mitigate the contemporary vulnerabilities in messaging applications. The user's device generates the public/private key pair during application installation, and asks its mobile network operator (MNO) to issue a digital certificate and store it on the blockchain. A user can fetch a certificate for another user from the chat server and communicate securely with them using a ratchet forward encryption mechanism.

Submitted 30 July, 2021; v1 submitted 17 April, 2021; originally announced April 2021.

arXiv:2103.11596 [pdf]

Monolingual and Parallel Corpora for Kangri Low Resource Language

Authors: Shweta Chauhan, Shefali Saxena, Philemon Daniel

Abstract: In this paper we present the dataset of Himachali low resource endangered language, Kangri (ISO 639-3: xnr) listed in the United Nations Educational, Scientific and Cultural Organization (UNESCO). The compilation of the Kangri corpus has been a challenging task due to the non-availability of the digitalized resources. The corpus contains 1,81,552 Monolingual and 27,362 Hindi-Kangri Parallel corpora. We shared pre-trained Kangri word embeddings. We also reported the Bilingual Evaluation Understudy (BLEU) score and Metric for Evaluation of Translation with Explicit ORdering (METEOR) score of Statistical Machine Translation (SMT) and Neural Machine Translation (NMT) results for the corpus. The corpus is freely available for non-commercial usages and research. To the best of our knowledge, this is the first Himachali low resource endangered language corpus. The resources are available at https://github.com/chauhanshweta/Kangri_corpus.

Submitted 22 March, 2021; originally announced March 2021.

Comments: 7 pages, 6 Tables, 1 Figure

arXiv:2101.11881 [pdf, other]

doi 10.1371/journal.pone.0262708

Deep learning via LSTM models for COVID-19 infection forecasting in India

Authors: Rohitash Chandra, Ayush Jain, Divyanshu Singh Chauhan

Abstract: The COVID-19 pandemic continues to have a major impact on health and medical infrastructure, economy, and agriculture. Prominent computational and mathematical models have been unreliable due to the complexity of the spread of infections. Moreover, lack of data collection and reporting makes modelling attempts difficult and unreliable. Hence, we need to re-look at the situation with reliable data sources and innovative forecasting models. Deep learning models such as recurrent neural networks are well suited for modelling spatiotemporal sequences. In this paper, we apply recurrent neural networks such as long short term memory (LSTM), bidirectional LSTM, and encoder-decoder LSTM models for multi-step (short-term) COVID-19 infection forecasting. We select Indian states with COVID-19 hotspots and capture the first (2020) and second (2021) wave of infections and provide two months ahead forecast. Our model predicts that the likelihood of another wave of infections in October and November 2021 is low; however, the authorities need to be vigilant given emerging variants of the virus. The accuracy of the predictions motivates the application of the method in other countries and regions. Nevertheless, the challenges in modelling remain due to the reliability of data and difficulties in capturing factors such as population density, logistics, and social aspects such as culture and lifestyle.

Submitted 17 January, 2022; v1 submitted 28 January, 2021; originally announced January 2021.

Comments: PLOS One, 2022

arXiv:2009.10400 [pdf]

doi 10.1016/j.cmpb.2020.105789

Towards real-time finite-strain anisotropic thermo-visco-elastodynamic analysis of soft tissues for thermal ablative therapy

Authors: Jinao Zhang, Remi Jacob Lay, Stuart K. Roberts, Sunita Chauhan

Abstract: Accurate and efficient prediction of soft tissue temperatures is essential to computer-assisted treatment systems for thermal ablation. It can be used to predict tissue temperatures and ablation volumes for personalised treatment planning and image-guided intervention. Numerically, it requires full nonlinear modelling of the coupled computational bioheat transfer and biomechanics, and efficient solution procedures; however, existing studies considered the bioheat analysis alone or the coupled linear analysis, without the fully coupled nonlinear analysis. We present a coupled thermo-visco-hyperelastic finite element algorithm, based on finite-strain thermoelasticity and total Lagrangian explicit dynamics. It considers the coupled nonlinear analysis of (i) bioheat transfer under soft tissue deformations and (ii) soft tissue deformations due to thermal expansion/shrinkage. The presented method accounts for anisotropic, finite-strain, temperature-dependent, thermal, and viscoelastic behaviours of soft tissues, and it is implemented using GPU acceleration for real-time computation. We also demonstrate the translational benefits of the presented method for clinical applications using a simulation of thermal ablation in the liver. The key advantage of the presented method is that it enables full nonlinear modelling of the anisotropic, finite-strain, temperature-dependent, thermal, and viscoelastic behaviours of soft tissues, instead of linear elastic, linear viscoelastic, and thermal-only modelling in the existing methods. It also provides high computational speeds for computer-assisted treatment systems towards enabling the operator to simulate thermal ablation accurately and visualise tissue temperatures and ablation zones immediately.

Submitted 31 December, 2021; v1 submitted 22 September, 2020; originally announced September 2020.

Comments: Submitted to Computer Methods and Programs in Biomedicine

Journal ref: Computer Methods and Programs in Biomedicine, vol. 198, pp. 105789, 2021

arXiv:2009.06451 [pdf, other]

Development of a Dataset and a Deep Learning Baseline Named Entity Recognizer for Three Low Resource Languages: Bhojpuri, Maithili and Magahi

Authors: Rajesh Kumar Mundotiya, Shantanu Kumar, Ajeet kumar, Umesh Chandra Chaudhary, Supriya Chauhan, Swasti Mishra, Praveen Gatla, Anil Kumar Singh

Abstract: In Natural Language Processing (NLP) pipelines, Named Entity Recognition (NER) is one of the preliminary problems, which marks proper nouns and other named entities such as Location, Person, Organization, Disease etc. Such entities, without a NER module, adversely affect the performance of a machine translation system. NER helps in overcoming this problem by recognising and handling such entities separately, although it can be useful in Information Extraction systems also. Bhojpuri, Maithili and Magahi are low resource languages, usually known as Purvanchal languages. This paper focuses on the development of a NER benchmark dataset for the Machine Translation systems developed to translate from these languages to Hindi by annotating parts of their available corpora. Bhojpuri, Maithili and Magahi corpora of sizes 228373, 157468 and 56190 tokens, respectively, were annotated using 22 entity labels. The annotation considers coarse-grained annotation labels followed by the tagset used in one of the Hindi NER datasets. We also report a Deep Learning based baseline that uses an LSTM-CNNs-CRF model. The lower baseline F1-scores from the NER tool obtained by using Conditional Random Fields models are 96.73 for Bhojpuri, 93.33 for Maithili and 95.04 for Magahi. The Deep Learning-based technique (LSTM-CNNs-CRF) achieved 96.25 for Bhojpuri, 93.33 for Maithili and 95.44 for Magahi.

Submitted 14 September, 2020; originally announced September 2020.

Comments: 34 pages; 7 figures

arXiv:2007.03987 [pdf, other]

doi 10.1109/MM.2020.3005883

Power Side-Channel Attacks in Negative Capacitance Transistor (NCFET)

Authors: Johann Knechtel, Satwik Patnaik, Mohammed Nabeel, Mohammed Ashraf, Yogesh S. Chauhan, Jörg Henkel, Ozgur Sinanoglu, Hussam Amrouch

Abstract: Side-channel attacks have empowered bypassing of cryptographic components in circuits. Power side-channel (PSC) attacks have received particular traction, owing to their non-invasiveness and proven effectiveness. Aside from prior art focused on conventional technologies, this is the first work to investigate the emerging Negative Capacitance Transistor (NCFET) technology in the context of PSC attacks. We implement a CAD flow for PSC evaluation at design-time. It leverages industry-standard design tools, while also employing the widely-accepted correlation power analysis (CPA) attack. Using standard-cell libraries based on the 7nm FinFET technology for NCFET and its counterpart CMOS setup, our evaluation reveals that NCFET-based circuits are more resilient to the classical CPA attack, due to the considerable effect of negative capacitance on the switching power. We also demonstrate that the thicker the ferroelectric layer, the higher the resiliency of the NCFET-based circuit, which opens new doors for optimization and trade-offs.

Submitted 8 July, 2020; originally announced July 2020.

arXiv:2006.09290 [pdf, other]

doi 10.1007/s10836-019-05849-1

Novel Randomized Placement for FPGA Based Robust ROPUF with Improved Uniqueness

Authors: Arjun Singh Chauhan, Vineet Sahula, Atanendu Sekhar Mandal

Abstract: The physical unclonable functions (PUF) are used to provide software as well as hardware security for the cyber-physical systems. They have been used for performing significant cryptography tasks such as generating keys, device authentication, securing against IP piracy, and to produce the root of trust as well. However, they lack in reliability metric. We present a novel approach for improving the reliability as well as the uniqueness of the field programmable gate array (FPGA) based ring oscillator PUF and derive a random number, consuming very small area (< 1%) concerning look-up tables (LUTs). We use frequency profiling method for distributing frequency variations in ring oscillators (RO), spatially placed all across the FPGA floor. We are able to spot suitable locations for RO mapping, which leads to enhanced ROPUF reliability. We have evaluated the proposed methodology on Xilinx 7-series FPGAs and tested the robustness against environmental variations, e.g. temperature and supply voltage variations, simultaneously. The proposed approach achieves significant improvement (i) in uniqueness value up to 49.90%, within 0.1% of the theoretical value, (ii) in the reliability value up to 99.70%, which signifies that less than 1 bit flipping has been observed on average, and (iii) in randomness, signified by passing NIST test suite. The response generated through the ROPUF passes all the applicable relevant tests of NIST uniformity statistical test suite.

Submitted 7 June, 2020; originally announced June 2020.

ACM Class: E.3

Journal ref: Journal of Electronic Testing volume 35, pages 581 to 601 (2019)

arXiv:2002.02084 [pdf, other]

doi 10.1109/ISGT-Europe47291.2020.9248952

A Stochastic Game Framework for Efficient Energy Management in Microgrid Networks

Authors: Shravan Nayak, Chanakya Ajit Ekbote, Annanya Pratap Singh Chauhan, Raghuram Bharadwaj Diddigi, Prishita Ray, Abhinava Sikdar, Sai Koti Reddy Danda, Shalabh Bhatnagar

Abstract: We consider the problem of energy management in microgrid networks. A microgrid is capable of generating a limited amount of energy from a renewable resource and is responsible for handling the demands of its dedicated customers. Owing to the variable nature of renewable generation and the demands of the customers, it becomes imperative that each microgrid optimally manages its energy. This involves intelligently scheduling the demands at the customer side, selling (when there is a surplus) and buying (when there is a deficit) the power from its neighboring microgrids depending on its current and future needs. Typically, the transaction of power among the microgrids happens at a pre-decided price by the central grid. In this work, we formulate the problems of demand and battery scheduling, energy trading and dynamic pricing (where we allow the microgrids to decide the price of the transaction depending on their current configuration of demand and renewable energy) in the framework of stochastic games. Subsequently, we propose a novel approach that makes use of independent learners Deep Q-learning algorithm to solve this problem. Through extensive empirical evaluation, we show that our proposed framework is more beneficial to the majority of the microgrids and we provide a detailed analysis of the results.

Submitted 15 November, 2020; v1 submitted 5 February, 2020; originally announced February 2020.

arXiv:1912.06991 [pdf]

Applying Deep Learning to Detect Traffic Accidents in Real Time Using Spatiotemporal Sequential Data

Authors: Amir Bahador Parsa, Rishabh Singh Chauhan, Homa Taghipour, Sybil Derrible, Abolfazl Mohammadian

Abstract: Accident detection is a vital part of traffic safety. Many road users suffer from traffic accidents, as well as their consequences such as delay, congestion, air pollution, and so on. In this study, we utilize two advanced deep learning techniques, Long Short-Term Memory (LSTM) and Gated Recurrent Units (GRUs), to detect traffic accidents in Chicago. These two techniques are selected because they are known to perform well with sequential data (i.e., time series). The full dataset consists of 241 accident and 6,038 non-accident cases selected from Chicago expressway, and it includes traffic spatiotemporal data, weather condition data, and congestion status data. Moreover, because the dataset is imbalanced (i.e., the dataset contains many more non-accident cases than accident cases), Synthetic Minority Over-sampling Technique (SMOTE) is employed. Overall, the two models perform significantly well, both with an Area Under Curve (AUC) of 0.85. Nonetheless, the GRU model is observed to perform slightly better than LSTM model with respect to detection rate. The performance of both models is similar in terms of false alarm rate.

Submitted 22 December, 2019; v1 submitted 15 December, 2019; originally announced December 2019.

Comments: 13 pages, 4 figures, 2 tables

arXiv:1911.11845 [pdf]

doi 10.1016/j.cmpb.2019.105244

Fast computation of soft tissue thermal response under deformation based on fast explicit dynamics finite element algorithm for surgical simulation

Authors: Jinao Zhang, Sunita Chauhan

Abstract: During thermal heating surgical procedures such as electrosurgery, thermal ablative treatment and hyperthermia, soft tissue deformation due to tool-tissue interaction and patients' motion can affect the distribution of induced thermal energy. Tissue temperature must be efficiently and accurately obtained from deformed tissues for precise thermal energy delivery; however, the classical Pennes bio-heat transfer can handle only the static non-moving state of soft tissue. This paper presents a formulation of bio-heat transfer under the effect of tissue deformation for fast or near real-time tissue temperature computation, based on fast explicit dynamics finite element algorithm for transient heat transfer. The proposed computation is achieved by transformation of the unknown deformed tissue state to the known initial non-moving state via a mapping function. The appropriateness and effectiveness of the proposed methodology are evaluated on a realistic virtual human liver with blood vessels to demonstrate a clinically relevant scenario of thermal ablation of hepatic cancer. Compared against the established non-linear procedures from commercial finite element analysis package, ABAQUS/CAE, the proposed methodology can achieve a typical 1.0e-3 level of normalized relative error at nodes and between 1.0e-4 and 1.0e-5 level of total errors, which is in good agreement with ABAQUS solutions. The proposed method consumes slightly more time than the formulation without soft tissue deformation, and computation performance of five different formulations are examined. The proposed method can be applied with bio-mechanical deformable models for fast or near real-time computation of non-linear bio-heat transfer, leading to translational potential in dynamic tissue temperature predictive analysis and thermal dosimetry computation for computer-integrated medical education and personalized treatments.

Submitted 30 December, 2021; v1 submitted 26 November, 2019; originally announced November 2019.

Comments: Accepted for publication in Computer Methods and Programs in Biomedicine

Journal ref: Computer Methods and Programs in Biomedicine, vol. 187, pp. 105244, 2020

arXiv:1909.05098 [pdf]

doi 10.1080/10407790.2019.1627812

Real-time computation of bio-heat transfer in the fast explicit dynamics finite element algorithm (FED-FEM) framework

Authors: Jinao Zhang, Sunita Chauhan

Abstract: Real-time analysis of bio-heat transfer is very beneficial in improving clinical outcomes of hyperthermia and thermal ablative treatments but challenging to achieve due to large computational costs. This paper presents a fast numerical algorithm well suited for real-time solutions of bio-heat transfer, and it achieves real-time computation via the (i) computationally efficient explicit dynamics in the temporal domain, (ii) element-level thermal load computation, (iii) computationally efficient finite elements, (iv) explicit formulation for unknown nodal temperature, and (v) pre-computation of constant simulation matrices and parameters, all of which lead to a significant reduction in computation time for fast run-time computation. The proposed methodology considers temperature-dependent thermal properties for nonlinear characteristics of bio-heat transfer in soft tissue. Utilising a parallel execution, the proposed method achieves computation time reduction of 107.71 and 274.57 times compared to those of with and without parallelisation of the commercial finite element codes if temperature-dependent thermal properties are considered, and 303.07 and 772.58 times if temperature-independent thermal properties are considered, far exceeding the computational performance of the commercial finite element codes, presenting great potential in real-time predictive analysis of tissue temperature for planning, optimisation and evaluation of thermo-therapeutic treatments. The source code is available at https://github.com/jinaojakezhang/FEDFEMBioheat.

Submitted 29 December, 2021; v1 submitted 7 September, 2019; originally announced September 2019.

Comments: Published in Numerical Heat Transfer, Part B: Fundamentals. arXiv admin note: text overlap with arXiv:1909.03355

Journal ref: Numerical Heat Transfer, Part B: Fundamentals, vol. 75, no. 4, pp. 217-238, 2019

arXiv:1909.03355 [pdf]

doi 10.1016/j.ijthermalsci.2019.01.030

Fast explicit dynamics finite element algorithm for transient heat transfer

Authors: Jinao Zhang, Sunita Chauhan

Abstract: This paper presents a novel methodology for fast simulation and analysis of transient heat transfer. The proposed methodology is suitable for real-time applications owing to (i) establishing the solution method from the viewpoint of computationally efficient explicit dynamics, (ii) proposing an element-level thermal load computation to eliminate the need for assembling global thermal stiffness, leading to (iii) an explicit formulation of nodal temperature computation to eliminate the need for iterations anywhere in the algorithm, (iv) pre-computing the constant matrices and simulation parameters to facilitate online calculation, and (v) utilising computationally efficient finite elements to efficiently obtain thermal responses in the spatial domain, all of which lead to a significant reduction in computation time for fast run-time simulation. The proposed fast explicit dynamics finite element algorithm (FED-FEM) employs nonlinear thermal material properties, such as temperature-dependent thermal conductivity and specific heat capacity, and nonlinear thermal boundary conditions, such as heat convection and radiation, to account for nonlinear characteristics of transient heat transfer. Simulations and comparison analyses demonstrate that not only can the proposed methodology handle isotropic, orthotropic, anisotropic and temperature-dependent thermal properties but also satisfy the standard patch tests and achieve good agreement with those of the commercial finite element analysis packages for numerical accuracy, for 3-D heat conduction, convection, radiation, and thermal gradient concentration problems. Furthermore, the proposed FED-FEM algorithm is computationally efficient and only consumes a small computation time, capable of achieving real-time computational performance, leading to a novel methodology suitable for real-time simulation and analysis of transient heat transfer.

Submitted 28 December, 2021; v1 submitted 7 September, 2019; originally announced September 2019.

Comments: Published in International Journal of Thermal Sciences

Journal ref: International Journal of Thermal Sciences, vol. 139, pp. 160-175, 2019

arXiv:1905.05812 [pdf, other]

Multi-task Learning for Multi-modal Emotion Recognition and Sentiment Analysis

Authors: Md Shad Akhtar, Dushyant Singh Chauhan, Deepanway Ghosal, Soujanya Poria, Asif Ekbal, Pushpak Bhattacharyya

Abstract: Related tasks often have inter-dependence on each other and perform better when solved in a joint framework. In this paper, we present a deep multi-task learning framework that jointly performs both sentiment and emotion analysis. The multi-modal inputs (i.e., text, acoustic and visual frames) of a video convey diverse and distinctive information, and usually do not have equal contribution in the decision making. We propose a context-level inter-modal attention framework for simultaneously predicting the sentiment and expressed emotions of an utterance. We evaluate our proposed approach on the CMU-MOSEI dataset for multi-modal sentiment and emotion analysis. Evaluation results suggest that the multi-task learning framework offers improvement over the single-task framework. The proposed approach reports new state-of-the-art performance for both sentiment analysis and emotion analysis.

Submitted 14 May, 2019; originally announced May 2019.

Comments: Accepted for publication in NAACL:HLT-2019

arXiv:1901.06358

Embedded CNN based vehicle classification and counting in non-laned road traffic

Authors: Mayank Singh Chauhan, Arshdeep Singh, Mansi Khemka, Arneish Prateek, Rijurekha Sen

Abstract: Classifying and counting vehicles in road traffic has numerous applications in the transportation engineering domain. However, the wide variety of vehicles (two-wheelers, three-wheelers, cars, buses, trucks etc.) plying on roads of developing regions without any lane discipline makes vehicle classification and counting a hard problem to automate. In this paper, we use state-of-the-art Convolutional Neural Network (CNN) based object detection models and train them for multiple vehicle classes using data from Delhi roads. We get up to 75% MAP on an 80-20 train-test split using 5562 video frames from four different locations. As robust network connectivity is scarce in developing regions for continuous video transmissions from the road to cloud servers, we also evaluate the latency, energy and hardware cost of embedded implementations of our CNN-model-based inference.

Submitted 18 January, 2019; originally announced January 2019.

Comments: *These authors contributed equally

arXiv:1809.06032

Hybrid Block Diagonalization for Massive MIMO Two-Way Half-Duplex AF Hybrid Relay

Authors: Arpita Singh Chauhan, Ekant Sharma, Rohit Budhiraja

Abstract: We consider a multi-pair two-way amplify-and-forward massive multi-input multi-output (MIMO) hybrid relay with MIMO user-pairs. A hybrid relay has fewer radio frequency (RF) chains than antennas, which significantly reduces the implementation cost. We employ block-diagonalization-based baseband processing at the hybrid relay to cancel the inter-user-pair interference and equal-gain-combining-based RF processing to maximize the beamforming gain. We also use an algebraic norm maximizing relay transmit strategy to maximize the spectral efficiency (SE) of each user-pair. We numerically show that the proposed hybrid relay has only marginally lower SE than a full RF-chain relay.

Submitted 17 September, 2018; originally announced September 2018.

Comments: Accepted for publication in IEEE SPCOM'18

Showing 1–50 of 70 results for author: Chauhan, S