-
Rethinking Citation of AI Sources in Student-AI Collaboration within HCI Design Education
Authors:
Prakash Shukla,
Suchismita Naik,
Ike Obi,
Jessica Backus,
Nancy Rasche,
Paul Parsons
Abstract:
The growing integration of AI tools in student design projects presents an unresolved challenge in HCI education: how should AI-generated content be cited and documented? Traditional citation frameworks -- grounded in credibility, retrievability, and authorship -- struggle to accommodate the dynamic and ephemeral nature of AI outputs. In this paper, we examine how undergraduate students in a UX design course approached AI usage and citation when given the freedom to integrate generative tools into their design process. Through qualitative analysis of 35 team projects and reflections from 175 students, we identify varied citation practices ranging from formal attribution to indirect or absent acknowledgment. These inconsistencies reveal gaps in existing frameworks and raise questions about authorship, assessment, and pedagogical transparency. We argue for rethinking AI citation as a reflective and pedagogical practice, one that supports metacognitive engagement by prompting students to critically evaluate how and why they used AI throughout the design process. We propose alternative strategies -- such as AI contribution statements and process-aware citation models -- that better align with the iterative and reflective nature of design education. This work invites educators to reconsider how citation practices can support meaningful student-AI collaboration.
Submitted 13 June, 2025; v1 submitted 10 June, 2025;
originally announced June 2025.
-
Tracing the Invisible: Understanding Students' Judgment in AI-Supported Design Work
Authors:
Suchismita Naik,
Prakash Shukla,
Ike Obi,
Jessica Backus,
Nancy Rasche,
Paul Parsons
Abstract:
As generative AI tools become integrated into design workflows, students increasingly engage with these tools not just as aids, but as collaborators. This study analyzes reflections from 33 student teams in an HCI design course to examine the kinds of judgments students make when using AI tools. We found both established forms of design judgment (e.g., instrumental, appreciative, quality) and emergent types: agency-distribution judgment and reliability judgment. These new forms capture how students negotiate creative responsibility with AI and assess the trustworthiness of its outputs. Our findings suggest that generative AI introduces new layers of complexity into design reasoning, prompting students to reflect not only on what AI produces, but also on how and when to rely on it. By foregrounding these judgments, we offer a conceptual lens for understanding how students engage in co-creative sensemaking with AI in design contexts.
Submitted 13 May, 2025;
originally announced May 2025.
-
Thin-Shell-SfT: Fine-Grained Monocular Non-rigid 3D Surface Tracking with Neural Deformation Fields
Authors:
Navami Kairanda,
Marc Habermann,
Shanthika Naik,
Christian Theobalt,
Vladislav Golyanik
Abstract:
3D reconstruction of highly deformable surfaces (e.g. cloths) from monocular RGB videos is a challenging problem, and no solution provides a consistent and accurate recovery of fine-grained surface details. To account for the ill-posed nature of the setting, existing methods use deformation models with statistical, neural, or physical priors. They also predominantly rely on nonadaptive discrete surface representations (e.g. polygonal meshes), perform frame-by-frame optimisation leading to error propagation, and suffer from poor gradients of the mesh-based differentiable renderers. Consequently, fine surface details such as cloth wrinkles are often not recovered with the desired accuracy. In response to these limitations, we propose Thin-Shell-SfT, a new method for non-rigid 3D tracking that represents a surface as an implicit and continuous spatiotemporal neural field. We incorporate a continuous thin shell physics prior based on the Kirchhoff-Love model for spatial regularisation, which contrasts starkly with the discretised alternatives of earlier works. Lastly, we leverage 3D Gaussian splatting to differentiably render the surface into image space and optimise the deformations based on analysis-by-synthesis principles. Our Thin-Shell-SfT outperforms prior works qualitatively and quantitatively thanks to our continuous surface formulation in conjunction with a specially tailored simulation prior and surface-induced 3D Gaussians. See our project page at https://4dqv.mpiinf.mpg.de/ThinShellSfT.
Submitted 25 March, 2025;
originally announced March 2025.
-
Multi-View Transformers for Airway-To-Lung Ratio Inference on Cardiac CT Scans: The C4R Study
Authors:
Sneha N. Naik,
Elsa D. Angelini,
Eric A. Hoffman,
Elizabeth C. Oelsner,
R. Graham Barr,
Benjamin M. Smith,
Andrew F. Laine
Abstract:
The ratio of airway tree lumen to lung size (ALR), assessed at full inspiration on high resolution full-lung computed tomography (CT), is a major risk factor for chronic obstructive pulmonary disease (COPD). There is growing interest in inferring ALR from cardiac CT images, which are widely available in epidemiological cohorts, to investigate the relationship of ALR to severe COVID-19 and post-acute sequelae of SARS-CoV-2 infection (PASC). Previously, cardiac scans included approximately 2/3 of the total lung volume with 5-6x greater slice thickness than high-resolution (HR) full-lung (FL) CT. In this study, we present a novel attention-based Multi-view Swin Transformer to infer FL ALR values from segmented cardiac CT scans. For the supervised training we exploit paired full-lung and cardiac CTs acquired in the Multi-Ethnic Study of Atherosclerosis (MESA). Our network significantly outperforms a proxy direct ALR inference on segmented cardiac CT scans and achieves accuracy and reproducibility comparable with a scan-rescan reproducibility of the FL ALR ground-truth.
Submitted 15 January, 2025;
originally announced January 2025.
-
LMUnit: Fine-grained Evaluation with Natural Language Unit Tests
Authors:
Jon Saad-Falcon,
Rajan Vivek,
William Berrios,
Nandita Shankar Naik,
Matija Franklin,
Bertie Vidgen,
Amanpreet Singh,
Douwe Kiela,
Shikib Mehri
Abstract:
As language models become integral to critical workflows, assessing their behavior remains a fundamental challenge -- human evaluation is costly and noisy, while automated metrics provide only coarse, difficult-to-interpret signals. We introduce natural language unit tests, a paradigm that decomposes response quality into explicit, testable criteria, along with a unified scoring model, LMUnit, which combines multi-objective training across preferences, direct ratings, and natural language rationales. Through controlled human studies, we show this paradigm significantly improves inter-annotator agreement and enables more effective LLM development workflows. LMUnit achieves state-of-the-art performance on evaluation benchmarks (FLASK, BigGenBench) and competitive results on RewardBench. These results validate both our proposed paradigm and scoring model, suggesting a promising path forward for language model evaluation and development.
Submitted 17 December, 2024;
originally announced December 2024.
-
From Lived Experience to Insight: Unpacking the Psychological Risks of Using AI Conversational Agents
Authors:
Mohit Chandra,
Suchismita Naik,
Denae Ford,
Ebele Okoli,
Munmun De Choudhury,
Mahsa Ershadi,
Gonzalo Ramos,
Javier Hernandez,
Ananya Bhattacharjee,
Shahed Warreth,
Jina Suh
Abstract:
Recent gains in popularity of AI conversational agents have led to their increased use for improving productivity and supporting well-being. While previous research has aimed to understand the risks associated with interactions with AI conversational agents, these studies often fall short in capturing the lived experiences of individuals. Additionally, psychological risks have often been presented as a sub-category within broader AI-related risks in past taxonomy works, leading to under-representation of the impact of psychological risks of AI use. To address these challenges, our work presents a novel risk taxonomy focusing on psychological risks of using AI gathered through the lived experiences of individuals. We employed a mixed-method approach, involving a comprehensive survey with 283 people with lived mental health experience and workshops involving experts with lived experience to develop a psychological risk taxonomy. Our taxonomy features 19 AI behaviors, 21 negative psychological impacts, and 15 contexts related to individuals. Additionally, we propose a novel multi-path vignette-based framework for understanding the complex interplay between AI behaviors, psychological impacts, and individual user contexts. Finally, based on the feedback obtained from the workshop sessions, we present design recommendations for developing safer and more robust AI agents. Our work offers an in-depth understanding of the psychological risks associated with AI conversational agents and provides actionable recommendations for policymakers, researchers, and developers.
Submitted 29 May, 2025; v1 submitted 10 December, 2024;
originally announced December 2024.
-
Maximal Extractable Value in Decentralized Finance: Taxonomy, Detection, and Mitigation
Authors:
Huned Materwala,
Shraddha M. Naik,
Aya Taha,
Tala Abdulrahman Abed,
Davor Svetinovic
Abstract:
Decentralized Finance (DeFi) leverages blockchain-enabled smart contracts to deliver automated and trustless financial services without the need for intermediaries. However, the public visibility of financial transactions on the blockchain can be exploited, as participants can reorder, insert, or remove transactions to extract value, often at the expense of others. This extracted value is known as the Maximal Extractable Value (MEV). MEV causes financial losses and consensus instability, disrupting the security, efficiency, and decentralization goals of the DeFi ecosystem. Therefore, it is crucial to analyze, detect, and mitigate MEV to safeguard DeFi. Our comprehensive survey offers a holistic view of the MEV landscape in the DeFi ecosystem. We present an in-depth understanding of MEV through a novel taxonomy of MEV transactions supported by real transaction examples. We perform a critical comparative analysis of various MEV detection approaches, evaluating their effectiveness in identifying different transaction types. Furthermore, we assess different categories of MEV mitigation strategies and discuss their limitations. We identify the challenges of current mitigation and detection approaches and discuss potential solutions. This survey provides valuable insights for researchers, developers, stakeholders, and policymakers, helping to curb and democratize MEV for a more secure and efficient DeFi ecosystem.
Submitted 28 February, 2025; v1 submitted 22 October, 2024;
originally announced November 2024.
-
A Compounded Burr Probability Distribution for Fitting Heavy-Tailed Data with Applications to Biological Networks
Authors:
Tanujit Chakraborty,
Swarup Chattopadhyay,
Suchismita Das,
Shraddha M. Naik,
Chittaranjan Hens
Abstract:
Complex biological networks, encompassing metabolic pathways, gene regulatory systems, and protein-protein interaction networks, often exhibit scale-free structures characterized by heavy-tailed degree distributions. However, empirical studies reveal significant deviations from ideal power law behavior, underscoring the need for more flexible and accurate probabilistic models. In this work, we propose the Compounded Burr (CBurr) distribution, a novel four-parameter family derived by compounding the Burr distribution with a discrete mixing process. This model is specifically designed to capture both the body and tail behavior of real-world network degree distributions with applications to biological networks. We rigorously derive its statistical properties, including moments, hazard and risk functions, and tail behavior, and develop an efficient maximum likelihood estimation framework. The CBurr model demonstrates broad applicability to networks with complex connectivity patterns, particularly in biological, social, and technological domains. Extensive experiments on large-scale biological network datasets show that CBurr consistently outperforms classical power-law, log-normal, and other heavy-tailed models across the full degree spectrum. By providing a statistically grounded and interpretable framework, the CBurr model enhances our ability to characterize the structural heterogeneity of biological networks.
Submitted 26 April, 2025; v1 submitted 5 July, 2024;
originally announced July 2024.
-
CommVQA: Situating Visual Question Answering in Communicative Contexts
Authors:
Nandita Shankar Naik,
Christopher Potts,
Elisa Kreiss
Abstract:
Current visual question answering (VQA) models tend to be trained and evaluated on image-question pairs in isolation. However, the questions people ask are dependent on their informational needs and prior knowledge about the image content. To evaluate how situating images within naturalistic contexts shapes visual questions, we introduce CommVQA, a VQA dataset consisting of images, image descriptions, real-world communicative scenarios where the image might appear (e.g., a travel website), and follow-up questions and answers conditioned on the scenario and description. CommVQA, which contains 1000 images and 8,949 question-answer pairs, poses a challenge for current models. Error analyses and a human-subjects study suggest that generated answers still contain high rates of hallucinations, fail to fittingly address unanswerable questions, and don't suitably reflect contextual information. Overall, we show that access to contextual information is essential for solving CommVQA, leading to the highest performing VQA model and highlighting the relevance of situating systems within communicative scenarios.
Submitted 3 October, 2024; v1 submitted 22 February, 2024;
originally announced February 2024.
-
Dress-Me-Up: A Dataset & Method for Self-Supervised 3D Garment Retargeting
Authors:
Shanthika Naik,
Kunwar Singh,
Astitva Srivastava,
Dhawal Sirikonda,
Amit Raj,
Varun Jampani,
Avinash Sharma
Abstract:
We propose a novel self-supervised framework for retargeting non-parameterized 3D garments onto 3D human avatars of arbitrary shapes and poses, enabling 3D virtual try-on (VTON). Existing self-supervised 3D retargeting methods only support parametric and canonical garments, which can only be draped over a parametric body, e.g. SMPL. To handle non-parametric garments and bodies, we propose a novel method that introduces Isomap-embedding-based correspondence matching between the garment and the human body to get a coarse alignment between the two meshes. We perform neural refinement of the coarse alignment in a self-supervised setting. Further, we leverage a Laplacian detail integration method for preserving the inherent details of the input garment. For evaluating our 3D non-parametric garment retargeting framework, we propose a dataset of 255 real-world garments with realistic noise and topological deformations. The dataset contains 44 unique garments worn by 15 different subjects in 5 distinctive poses, captured using a multi-view RGBD capture setup. We show superior retargeting quality on non-parametric garments and human avatars over existing state-of-the-art methods, acting as the first-ever baseline on the proposed dataset for non-parametric 3D garment retargeting.
Submitted 5 January, 2024;
originally announced January 2024.
-
Skew-Probabilistic Neural Networks for Learning from Imbalanced Data
Authors:
Shraddha M. Naik,
Tanujit Chakraborty,
Madhurima Panja,
Abdenour Hadid,
Bibhas Chakraborty
Abstract:
Real-world datasets often exhibit imbalanced data distribution, where certain class levels are severely underrepresented. In such cases, traditional pattern classifiers have shown a bias towards the majority class, impeding accurate predictions for the minority class. This paper introduces an imbalanced data-oriented classifier using probabilistic neural networks (PNN) with a skew-normal kernel function to address this major challenge. PNN is known for providing probabilistic outputs, enabling quantification of prediction confidence, interpretability, and the ability to handle limited data. By leveraging the skew-normal distribution, which offers increased flexibility, particularly for imbalanced and non-symmetric data, our proposed Skew-Probabilistic Neural Networks (SkewPNN) can better represent underlying class densities. Hyperparameter fine-tuning is imperative to optimize the performance of the proposed approach on imbalanced datasets. To this end, we employ a population-based heuristic algorithm, the Bat optimization algorithm, to explore the hyperparameter space effectively. We also prove the statistical consistency of the density estimates, suggesting that the true distribution will be approached smoothly as the sample size increases. Theoretical analysis of the computational complexity of the proposed SkewPNN and BA-SkewPNN is also provided. Numerical simulations have been conducted on different synthetic datasets, comparing various benchmark-imbalanced learners. Real-data analysis on several datasets shows that SkewPNN and BA-SkewPNN substantially outperform most state-of-the-art machine-learning methods for both balanced and imbalanced datasets (binary and multi-class categories) in most experimental settings.
Submitted 1 December, 2024; v1 submitted 10 December, 2023;
originally announced December 2023.
-
Ten Years of Generative Adversarial Nets (GANs): A survey of the state-of-the-art
Authors:
Tanujit Chakraborty,
Ujjwal Reddy K S,
Shraddha M. Naik,
Madhurima Panja,
Bayapureddy Manvitha
Abstract:
Since their inception in 2014, Generative Adversarial Networks (GANs) have rapidly emerged as powerful tools for generating realistic and diverse data across various domains, including computer vision and other applied areas. Consisting of a discriminative network and a generative network engaged in a Minimax game, GANs have revolutionized the field of generative modeling. In February 2018, GAN secured the leading spot on the "Top Ten Global Breakthrough Technologies List" issued by the Massachusetts Science and Technology Review. Over the years, numerous advancements have been proposed, leading to a rich array of GAN variants, such as conditional GAN, Wasserstein GAN, CycleGAN, and StyleGAN, among many others. This survey aims to provide a general overview of GANs, summarizing the latent architecture, validation metrics, and application areas of the most widely recognized variants. We also delve into recent theoretical developments, exploring the profound connection between the adversarial principle underlying GAN and Jensen-Shannon divergence, while discussing the optimality characteristics of the GAN framework. The efficiency of GAN variants and their model architectures will be evaluated along with training obstacles as well as training solutions. In addition, a detailed discussion will be provided, examining the integration of GANs with newly developed deep learning frameworks such as Transformers, Physics-Informed Neural Networks, Large Language Models, and Diffusion models. Finally, we reveal several issues as well as future research outlines in this field.
Submitted 30 August, 2023;
originally announced August 2023.
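For readers unfamiliar with the connection between the adversarial principle and the Jensen-Shannon divergence mentioned in the abstract above, the standard result from the original 2014 GAN formulation can be written as follows. The notation ($p_{\mathrm{data}}$ for the data distribution, $p_g$ for the generator's distribution, $p_z$ for the noise prior) is the conventional one and is an assumption here; the survey itself may use different symbols.

```latex
% Minimax objective of the original GAN
\min_G \max_D V(D, G)
  = \mathbb{E}_{x \sim p_{\mathrm{data}}}\bigl[\log D(x)\bigr]
  + \mathbb{E}_{z \sim p_z}\bigl[\log\bigl(1 - D(G(z))\bigr)\bigr]

% For a fixed generator, the optimal discriminator is
D^{*}_{G}(x) = \frac{p_{\mathrm{data}}(x)}{p_{\mathrm{data}}(x) + p_g(x)}

% Substituting D^{*}_{G} gives the generator's criterion
C(G) = \max_D V(D, G)
     = -\log 4 + 2\,\mathrm{JSD}\bigl(p_{\mathrm{data}} \,\|\, p_g\bigr)
```

Because the Jensen-Shannon divergence is non-negative and zero only when the two distributions coincide, the criterion reaches its minimum of $-\log 4$ exactly when $p_g = p_{\mathrm{data}}$.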
-
Reinforcing Security and Usability of Crypto-Wallet with Post-Quantum Cryptography and Zero-Knowledge Proof
Authors:
Yathin Kethepalli,
Rony Joseph,
Sai Raja Vajrala,
Jashwanth Vemula,
Nenavath Srinivas Naik
Abstract:
Crypto-wallets or digital asset wallets are a crucial aspect of managing cryptocurrencies and other digital assets such as NFTs. However, these wallets are not immune to security threats, particularly from the growing risk of quantum computing. The use of traditional public-key cryptography systems in digital asset wallets makes them vulnerable to attacks from quantum computers, which may increase in the future. Moreover, current digital wallets require users to keep track of seed-phrases, which can be challenging and lead to additional security risks. To overcome these challenges, a new algorithm is proposed that uses post-quantum cryptography (PQC) and zero-knowledge proof (ZKP) to enhance the security of digital asset wallets. The research focuses on the use of the Lattice-based Threshold Secret Sharing Scheme (LTSSS), Kyber Algorithm for key generation and ZKP for wallet unlocking, providing a more secure and user-friendly alternative to seed-phrase, brain and multi-sig protocol wallets. This algorithm also includes several innovative security features such as recovery of wallets in case of downtime of the server, and the ability to rekey the private key associated with a specific username-password combination, offering improved security and usability. The incorporation of PQC and ZKP provides a robust and comprehensive framework for securing digital assets in the present and future. This research aims to address the security challenges faced by digital asset wallets and proposes practical solutions to ensure their safety in the era of quantum computing.
Submitted 29 August, 2023; v1 submitted 14 August, 2023;
originally announced August 2023.
-
Selective Pre-training for Private Fine-tuning
Authors:
Da Yu,
Sivakanth Gopi,
Janardhan Kulkarni,
Zinan Lin,
Saurabh Naik,
Tomasz Lukasz Religa,
Jian Yin,
Huishuai Zhang
Abstract:
Text prediction models, when used in applications like email clients or word processors, must protect user data privacy and adhere to model size constraints. These constraints are crucial to meet memory and inference time requirements, as well as to reduce inference costs. Building small, fast, and private domain-specific language models is a thriving area of research. In this work, we show that a careful pre-training on a subset of the public dataset that is guided by the private dataset is crucial to train small language models with differential privacy. On standard benchmarks, small models trained with our new framework achieve state-of-the-art performance. In addition to performance improvements, our results demonstrate that smaller models, through careful pre-training and private fine-tuning, can match the performance of much larger models that do not have access to private data. This underscores the potential of private learning for model compression and enhanced efficiency.
Submitted 2 July, 2024; v1 submitted 23 May, 2023;
originally announced May 2023.
-
Estimating related words computationally using language model from the Mahabharata -- an Indian epic
Authors:
Vrunda Gadesha,
Keyur D Joshi,
Shefali Naik
Abstract:
The 'Mahabharata' is the most popular of the many works of Indian literature and is referred to in many domains for completely different purposes. The text has various dimensions and aspects that are useful to people in their personal and professional lives. This Indian epic was originally written in Sanskrit. In the era of Natural Language Processing, Artificial Intelligence, Machine Learning, and Human-Computer Interaction, the text can be processed according to domain requirements, and it is interesting to process it and extract useful insights from it. One limitation of human analysis of the Mahabharata is that readers always bring a sentiment towards the story narrated by the author. In addition, humans cannot memorize statistical or computational details: Which two words frequently occur together in one sentence? What is the average sentence length across the whole text? Which word is the most frequent across the text? What are the lemmas of the words used across the sentences? Thus, in this paper, we propose an NLP pipeline to obtain statistical and computational insights, along with a method for searching for the most relevant words, from the largest epic, the 'Mahabharata'. We stack different text-processing approaches to obtain the best results, which can be further used in the various domains where the Mahabharata needs to be referred to.
Submitted 9 May, 2023;
originally announced May 2023.
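The kinds of statistics the abstract above lists (most frequent word, average sentence length, word pairs that often share a sentence) can be illustrated with a minimal standard-library sketch. This is not the authors' pipeline: the helper name `corpus_stats`, the regex-based sentence and word splitting, and the sample text are all assumptions made for illustration, and lemmatization is left out.

```python
import re
from collections import Counter
from itertools import combinations


def corpus_stats(text):
    """Return simple corpus statistics for an English text (hypothetical helper)."""
    # Naive sentence split on ., ! and ? (an assumption, not a real sentence tokenizer).
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    # Lowercase and keep alphabetic tokens only.
    tokenized = [re.findall(r"[a-z']+", s.lower()) for s in sentences]
    tokenized = [tokens for tokens in tokenized if tokens]

    word_counts = Counter(word for tokens in tokenized for word in tokens)

    pair_counts = Counter()
    for tokens in tokenized:
        # Count each unordered pair of distinct words once per sentence.
        pair_counts.update(combinations(sorted(set(tokens)), 2))

    total_tokens = sum(len(tokens) for tokens in tokenized)
    avg_len = total_tokens / len(tokenized) if tokenized else 0.0
    return {
        "avg_sentence_length": avg_len,
        "word_counts": word_counts,
        "pair_counts": pair_counts,
    }


# Made-up sample text, not taken from any translation of the epic.
sample = "Arjuna draws the bow. Krishna guides Arjuna. Arjuna hears Krishna."
stats = corpus_stats(sample)
print(stats["word_counts"].most_common(1))
print(stats["pair_counts"].most_common(1))
print(stats["avg_sentence_length"])
```

Sorting the distinct words before pairing keeps each pair in a canonical order, so ("arjuna", "krishna") and ("krishna", "arjuna") are counted as the same pair.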
-
Planting and Mitigating Memorized Content in Predictive-Text Language Models
Authors:
C. M. Downey,
Wei Dai,
Huseyin A. Inan,
Kim Laine,
Saurabh Naik,
Tomasz Religa
Abstract:
Language models are widely deployed to provide automatic text completion services in user products. However, recent research has revealed that language models (especially large ones) bear considerable risk of memorizing private training data, which is then vulnerable to leakage and extraction by adversaries. In this study, we test the efficacy of a range of privacy-preserving techniques to mitigate unintended memorization of sensitive user text, while varying other factors such as model size and adversarial conditions. We test both "heuristic" mitigations (those without formal privacy guarantees) and Differentially Private training, which provides provable levels of privacy at the cost of some model performance. Our experiments show that, with the exception of L2 regularization, heuristic mitigations are largely ineffective in preventing memorization in our test suite, possibly because they make overly strong assumptions about the characteristics that define "sensitive" or "private" text. In contrast, Differential Privacy reliably prevents memorization in our experiments, despite its computational and model-performance costs.
Submitted 16 December, 2022;
originally announced December 2022.
-
Athletic Mobile Manipulator System for Robotic Wheelchair Tennis
Authors:
Zulfiqar Zaidi,
Daniel Martin,
Nathaniel Belles,
Viacheslav Zakharov,
Arjun Krishna,
Kin Man Lee,
Peter Wagstaff,
Sumedh Naik,
Matthew Sklar,
Sugju Choi,
Yoshiki Kakehi,
Ruturaj Patil,
Divya Mallemadugula,
Florian Pesce,
Peter Wilson,
Wendell Hom,
Matan Diamond,
Bryan Zhao,
Nina Moorman,
Rohan Paleja,
Letian Chen,
Esmaeil Seraj,
Matthew Gombolay
Abstract:
Athletics are a quintessential and universal expression of humanity. From French monks who in the 12th century invented jeu de paume, the precursor to modern lawn tennis, back to the K'iche' people who played the Maya Ballgame as a form of religious expression over three thousand years ago, humans have sought to train their minds and bodies to excel in sporting contests. Advances in robotics are opening up the possibility of robots in sports. Yet, key challenges remain, as most prior works in robotics for sports are limited to pristine sensing environments, do not require significant force generation, or are on miniaturized scales unsuited for joint human-robot play. In this paper, we propose the first open-source, autonomous robot for playing regulation wheelchair tennis. We demonstrate the performance of our full-stack system in executing ground strokes and evaluate each of the system's hardware and software components. The goal of this paper is to (1) inspire more research in human-scale robot athletics and (2) establish the first baseline for a reproducible wheelchair tennis robot for regulation singles play. Our paper contributes to the science of systems design and poses a set of key challenges for the robotics community to address in striving towards robots that can match human capabilities in sports.
Submitted 7 February, 2023; v1 submitted 5 October, 2022;
originally announced October 2022.
-
Probing Semantic Grounding in Language Models of Code with Representational Similarity Analysis
Authors:
Shounak Naik,
Rajaswa Patil,
Swati Agarwal,
Veeky Baths
Abstract:
Representational Similarity Analysis is a method from cognitive neuroscience that helps in comparing representations from two different sources of data. In this paper, we propose using Representational Similarity Analysis to probe the semantic grounding in language models of code. We probe representations from the CodeBERT model for semantic grounding by using the data from the IBM CodeNet dataset. Through our experiments, we show that current pre-training methods do not induce semantic grounding in language models of code, and instead focus on optimizing form-based patterns. We also show that even a small amount of fine-tuning on semantically relevant tasks increases the semantic grounding in CodeBERT significantly. Our ablations with the input modality to the CodeBERT model show that using bimodal inputs (code and natural language) over unimodal inputs (only code) gives better semantic grounding and sample efficiency during semantic fine-tuning. Finally, our experiments with semantic perturbations in code reveal that CodeBERT is able to robustly distinguish between semantically correct and incorrect code.
Submitted 15 July, 2022;
originally announced July 2022.
-
Wireless Self-Powered Visual and NDE low-Cost Inspection System For Small Diameter Live Gas Distribution Mains
Authors:
Shivani Naik,
Arjun Kumar,
Nitinesh Yadav,
K. M. Santosh
Abstract:
This paper presents the design of an in-pipe climbing robot that uses a novel transmission mechanism to traverse complex networks of pipes. Conventional wheeled/tracked in-pipe climbing robots are prone to slip and drag while traversing pipe bends. The mechanism eliminates slip and drag in the robot tracks during motion. The proposed transmission provides the functional capabilities of the standard two-output transmission, achieved here for the first time in a transmission with three outputs. The mechanism adjusts the track speeds of the robot according to the forces exerted on each track inside the pipe network, eliminating the need for any active control. Simulation of the robot traversing the pipe network in different orientations and through pipe bends without slip demonstrates the effectiveness of the proposed design.
Submitted 6 June, 2022;
originally announced June 2022.
-
Deep Generative Framework for Interactive 3D Terrain Authoring and Manipulation
Authors:
Shanthika Naik,
Aryamaan Jain,
Avinash Sharma,
KS Rajan
Abstract:
Automated generation and (user) authoring of realistic virtual terrain is highly sought after by multimedia applications such as VR models and gaming. The most common representation adopted for terrain is the Digital Elevation Model (DEM). Existing terrain authoring and modeling techniques have addressed some of these and can be broadly categorized as procedural modeling, simulation methods, and example-based methods. In this paper, we propose a novel realistic terrain authoring framework powered by a combination of a VAE and a generative conditional GAN model. Our framework is an example-based method that attempts to overcome the limitations of existing methods by learning a latent space from a real-world terrain dataset. This latent space allows us to generate multiple variants of terrain from a single input as well as interpolate between terrains while keeping the generated terrains close to the real-world data distribution. We also developed an interactive tool that lets the user generate diverse terrains with minimalist inputs. We perform thorough qualitative and quantitative analysis and provide comparisons with other SOTA methods. We intend to release our code/tool to the academic community.
Submitted 7 January, 2022;
originally announced January 2022.
-
Differentially Private Fine-tuning of Language Models
Authors:
Da Yu,
Saurabh Naik,
Arturs Backurs,
Sivakanth Gopi,
Huseyin A. Inan,
Gautam Kamath,
Janardhan Kulkarni,
Yin Tat Lee,
Andre Manoel,
Lukas Wutschitz,
Sergey Yekhanin,
Huishuai Zhang
Abstract:
We give simpler, sparser, and faster algorithms for differentially private fine-tuning of large-scale pre-trained language models, which achieve the state-of-the-art privacy versus utility tradeoffs on many standard NLP tasks. We propose a meta-framework for this problem, inspired by the recent success of highly parameter-efficient methods for fine-tuning. Our experiments show that differentially private adaptations of these approaches outperform previous private algorithms in three important dimensions: utility, privacy, and the computational and memory cost of private training. On many commonly studied datasets, the utility of private models approaches that of non-private models. For example, on the MNLI dataset we achieve an accuracy of $87.8\%$ using RoBERTa-Large and $83.5\%$ using RoBERTa-Base with a privacy budget of $ε = 6.7$. In comparison, absent privacy constraints, RoBERTa-Large achieves an accuracy of $90.2\%$. Our findings are similar for natural language generation tasks. When privately fine-tuned on the DART dataset, GPT-2-Small, GPT-2-Medium, GPT-2-Large, and GPT-2-XL achieve BLEU scores of 38.5, 42.0, 43.1, and 43.8 respectively (privacy budget of $ε = 6.8$, $δ = 10^{-5}$), whereas the non-private baseline is $48.1$. All our experiments suggest that larger models are better suited for private fine-tuning: while they are well known to achieve superior accuracy non-privately, we find that they also better maintain their accuracy when privacy is introduced.
Submitted 14 July, 2022; v1 submitted 13 October, 2021;
originally announced October 2021.
-
Predicting Credit Risk for Unsecured Lending: A Machine Learning Approach
Authors:
K. S. Naik
Abstract:
Since the 1990s, there have been significant advances in the technology space and the e-Commerce area, leading to an exponential increase in demand for cashless payment solutions. This has led to increased demand for credit cards, bringing with it the possibility of higher credit defaults and hence higher delinquency rates over time. The purpose of this research paper is to build a contemporary credit scoring model to forecast credit defaults for unsecured lending (credit cards) by employing machine learning techniques. Much of the customer payments data available to lenders for forecasting credit defaults is imbalanced (skewed) on account of a limited subset of default instances, which poses a challenge for predictive modelling. In this research, this challenge is addressed by deploying the Synthetic Minority Oversampling Technique (SMOTE), a proven technique for ironing out such imbalances in a given dataset. On running the research dataset through seven different machine learning models, the results indicate that the Light Gradient Boosting Machine (LGBM) Classifier model outperforms the other six classification techniques. Thus, our research indicates that the LGBM classifier model is better equipped to deliver higher learning speeds and better efficiencies and to manage larger data volumes. We expect that deployment of this model will enable better and timely prediction of credit defaults for decision-makers in commercial lending institutions and banks.
Submitted 5 October, 2021;
originally announced October 2021.
-
Traffic control Management System and Collision Avoidance System
Authors:
Gangadhar,
Parimala Prabhakar,
Abhishek S,
Prajwal,
Suraj Naik
Abstract:
Many road accidents occur because drivers fail to read sign boards for various reasons. Especially at night, driver fatigue reduces attention to small things such as speed-limit and curve-ahead sign boards. The aim is to create an IoT device that detects sign boards and can also communicate with traffic lights to make way for ambulances, enabling their smooth movement through city traffic. The implementation detects sign boards and measures vehicle speed using an Arduino and an RF transmitter, which transmits a specific beep sound for each type of zone, such as a speed breaker or school zone. The vehicle also contains an RF receiver and an Arduino, which start receiving the beep sound when the vehicle is near a sign board. After receiving the code, the Arduino starts measuring the current speed of the vehicle, and if the speed is above the recommended speed, it raises an alert. If the vehicle speed is not reduced even after the alert, the vehicle brakes automatically. With the help of this Traffic Management System (TMS), we can record the number of users who do not reduce vehicle speed even when prompted by the system alerts.
Submitted 5 October, 2021;
originally announced October 2021.
-
Predicting trajectory behaviour via machine-learned invariant manifolds
Authors:
Vladimír Krajňák,
Shibabrat Naik,
Stephen Wiggins
Abstract:
In this paper, we use support vector machines (SVM) to develop a machine learning framework to discover phase space structures that distinguish between distinct reaction pathways. The SVM model is trained using data from trajectories of Hamilton's equations and works well even with relatively few trajectories. Moreover, this framework is specifically designed to require minimal a priori knowledge of the dynamics in a system. This makes our approach computationally better suited than existing methods for high-dimensional systems and systems where integrating trajectories is expensive. We benchmark our approach on Chesnavich's CH$_4^+$ Hamiltonian.
Submitted 5 January, 2022; v1 submitted 21 July, 2021;
originally announced July 2021.
-
Support vector machines for learning reactive islands
Authors:
Shibabrat Naik,
Vladimír Krajňák,
Stephen Wiggins
Abstract:
We develop a machine learning framework that can be applied to data sets derived from the trajectories of Hamilton's equations. The goal is to learn the phase space structures that play the governing role for phase space transport relevant to particular applications. Our focus is on learning reactive islands in two degrees-of-freedom Hamiltonian systems. Reactive islands are constructed from the stable and unstable manifolds of unstable periodic orbits and play the role of quantifying transition dynamics. We show that support vector machines (SVM) are an appropriate machine learning framework for this purpose, as they provide an approach for finding the boundaries between qualitatively distinct dynamical behaviors, which is in the spirit of the phase space transport framework. We show how our method allows us to find reactive islands directly, in the sense that we do not have to first compute unstable periodic orbits and their stable and unstable manifolds. We apply our approach to the Hénon-Heiles Hamiltonian system, which is a benchmark system in the dynamical systems community. We discuss different sampling and learning approaches and their advantages and disadvantages.
Submitted 18 July, 2021;
originally announced July 2021.
-
Self Organizing Nebulous Growths for Robust and Incremental Data Visualization
Authors:
Damith Senanayake,
Wei Wang,
Shalin H. Naik,
Saman Halgamuge
Abstract:
Non-parametric dimensionality reduction techniques, such as t-SNE and UMAP, are proficient in providing visualizations for datasets of fixed sizes. However, they cannot incrementally map and insert new data points into an already provided data visualization. We present Self-Organizing Nebulous Growths (SONG), a parametric nonlinear dimensionality reduction technique that supports incremental data visualization, i.e., incremental addition of new data while preserving the structure of the existing visualization. In addition, SONG is capable of handling new data increments, no matter whether they are similar or heterogeneous to the already observed data distribution. We test SONG on a variety of real and simulated datasets. The results show that SONG is superior to Parametric t-SNE, t-SNE and UMAP in incremental data visualization. Specifically, for heterogeneous increments, SONG improves over Parametric t-SNE by 14.98% on the Fashion MNIST dataset and 49.73% on the MNIST dataset regarding the cluster quality measured by the Adjusted Mutual Information scores. On similar or homogeneous increments, the improvements are 8.36% and 42.26% respectively. Furthermore, even when the above datasets are presented all at once, SONG performs better than or comparably to UMAP, and is superior to t-SNE. We also demonstrate that the algorithmic foundations of SONG render it more tolerant to noise compared to UMAP and t-SNE, thus providing greater utility for data with high variance, high mixing of clusters, or noise.
Submitted 1 October, 2020; v1 submitted 9 December, 2019;
originally announced December 2019.
-
An Efficient Reconfigurable FIR Digital Filter Using Modified Distribute Arithmetic Technique
Authors:
Naveen S Naik,
Kiran A Gupta
Abstract:
This paper presents a modified Distributed Arithmetic based technique to compute the sum of products, saving an appreciable number of multiply-and-accumulate blocks, which in turn reduces circuit size. In this technique, a multiplexer-based structure is used to reuse the blocks so as to reduce the required memory locations. A Carry Look Ahead based adder tree is used to obtain a better area-delay product. The FIR filter is designed in VHDL and synthesized using the Xilinx 12.2 synthesis tool and ISIM simulator. The power analysis is done using the Xilinx XPower analyzer. The proposed structure requires nearly 42% fewer cells, 40% fewer LUT flip-flop pairs, and 2% less power compared with the existing structure.
Submitted 27 April, 2017;
originally announced April 2017.
-
A Bayesian Network approach to County-Level Corn Yield Prediction using historical data and expert knowledge
Authors:
Vikas Chawla,
Hsiang Sing Naik,
Adedotun Akintayo,
Dermot Hayes,
Patrick Schnable,
Baskar Ganapathysubramanian,
Soumik Sarkar
Abstract:
Crop yield forecasting is the methodology of predicting crop yields prior to harvest. The availability of accurate yield prediction frameworks has enormous implications from multiple standpoints, including impact on the crop commodity futures markets, formulation of agricultural policy, as well as crop insurance rating. The focus of this work is to construct a corn yield predictor at the county scale. Corn yield (forecasting) depends on a complex, interconnected set of variables that include economic, agricultural, management and meteorological factors. Conventional forecasting relies either on knowledge-based computer programs (that simulate plant-weather-soil-management interactions) coupled with targeted surveys, or on statistical models. The former is limited by the need for painstaking calibration, while the latter is limited to univariate analysis or similar simplifying assumptions that fail to capture the complex interdependencies affecting yield. In this paper, we propose a data-driven approach that is "gray box", i.e., one that seamlessly utilizes expert knowledge in constructing a statistical network model for corn yield forecasting. Our multivariate gray box model is developed on Bayesian network analysis to build a Directed Acyclic Graph (DAG) between predictors and yield. Starting from a complete graph connecting various carefully chosen variables and yield, expert knowledge is used to prune or strengthen edges connecting variables. Subsequently, the structure (connectivity and edge weights) of the DAG that maximizes the likelihood of observing the training data is identified via optimization. We curated an extensive set of historical data (1948-2012) for each of the 99 counties in Iowa to train the model.
Submitted 17 August, 2016;
originally announced August 2016.
-
Single image super resolution in spatial and wavelet domain
Authors:
Sapan Naik,
Nikunj Patel
Abstract:
Single image super resolution, which generates a high resolution image from a given low resolution image, has recently become a very important research area. Single image super resolution algorithms are mainly based on the wavelet domain and the spatial domain. In the wavelet domain, filter support is exploited to model the regularity of natural images, while in the spatial domain the edges of images are sharpened during upsampling. Here, a single image super resolution algorithm is presented that is based on both the spatial and wavelet domains and takes advantage of both. The algorithm is iterative and uses back projection to minimize reconstruction error. A wavelet-based denoising method is also introduced to remove noise.
Submitted 9 September, 2013;
originally announced September 2013.
-
A Unified Mechanism Design Framework for Networked Systems
Authors:
Tansu Alpcan,
Holger Boche,
Siddharth Naik
Abstract:
Mechanisms such as auctions and pricing schemes are utilized to design strategic (noncooperative) games for networked systems. Although the participating players are selfish, these mechanisms ensure that the game outcome is optimal with respect to a global criterion (e.g. maximizing a social welfare function), preference-compatible, and strategy-proof, i.e. players have no reason to deceive the designer. The mechanism designer achieves these objectives by introducing specific rules and incentives to the players; in this case by adding resource prices to their utilities. In auction-based mechanisms, the mechanism designer explicitly allocates the resources based on bids of the participants in addition to setting prices. Alternatively, pricing mechanisms enforce global objectives only by charging the players for the resources they have utilized. In either setting, the player preferences represented by utility functions may be coupled or decoupled, i.e. they depend on other players' actions or only on the player's own actions, respectively. The unified framework and its information structures are illustrated through multiple example resource allocation problems from wireless and wired networks.
Submitted 2 September, 2010;
originally announced September 2010.