-
TumorHoPe2: An updated database for Tumor Homing Peptides
Authors:
Diksha Kashyap,
Devanshi Gupta,
Naman Kumar Mehta,
Gajendra P. S. Raghava
Abstract:
Addressing the growing need for organized data on tumor homing peptides (THPs), we present TumorHoPe2, a manually curated database offering extensive details on experimentally validated THPs. This represents a significant update to TumorHoPe, originally developed by our group in 2012. TumorHoPe2 now contains 1847 entries, representing 1297 unique tumor homing peptides, a substantial expansion from the 744 entries in its predecessor. For each peptide, the database provides critical information, including sequence, terminal or chemical modifications, corresponding cancer cell lines, and specific tumor types targeted. The database compiles data from two primary sources: phage display libraries, which are commonly used to identify peptide ligands targeting tumor-specific markers, and synthetic peptides, which are chemically modified to enhance properties such as stability, binding affinity, and specificity. Our dataset includes 594 chemically modified peptides, with 255 having N-terminal and 195 C-terminal modifications. These THPs have been validated against 172 cancer cell lines and demonstrate specificity for 37 distinct tumor types. To maximize utility for the research community, TumorHoPe2 is equipped with intuitive tools for data searching, filtering, and analysis, alongside a RESTful API for efficient programmatic access and integration into bioinformatics pipelines. It is freely available at https://webs.iiitd.edu.in/raghava/tumorhope2/
Submitted 27 May, 2025;
originally announced May 2025.
-
MAP Format for Representing Chemical Modifications, Annotations, and Mutations in Protein Sequences: An Extension of the FASTA Format
Authors:
Akshay Shendre,
Naman Kumar Mehta,
Anand Singh Rathore,
Nishant Kumar,
Sumeet Patiyal,
Gajendra P. S. Raghava
Abstract:
Several formats, including FASTA, PIR, GenBank, EMBL, and GCG, have been developed for representing protein sequences composed of natural amino acids. Among these, FASTA remains the most widely used due to its simplicity and human readability. However, FASTA lacks the capability to represent chemically modified or non-natural residues, as well as structural annotations and mutations in protein variants. To address some of these limitations, the PEFF format was recently introduced as an extension of FASTA. Additionally, formats such as HELM and BILN have been proposed to represent amino acids and their modifications at the atomic level. Despite their advancements, these formats have not achieved widespread adoption within the bioinformatics community due to their complexity. To complement existing formats and overcome current challenges, we propose a new format called MAP (Modification and Annotation in Proteins), which enables comprehensive annotation of protein sequences. MAP introduces meta tags in the header for protein-level annotations and inline tags within the sequence for residue-level modifications. In this format, standard one-letter amino acid codes are augmented with curly-brace tags to denote various modifications, including phosphorylation, acetylation, non-natural residues, cyclization, and other residue-specific features. The header metadata also captures information such as organism, function, and sequence variants. We describe the structure, objectives, and capabilities of the MAP format and demonstrate its application in bioinformatics, particularly in the domain of protein therapeutics. To facilitate community adoption, we are developing a comprehensive suite of MAP-format resources, including a detailed manual, annotated datasets, and conversion tools, available at http://webs.iiitd.edu.in/raghava/maprepo/.
Submitted 6 May, 2025;
originally announced May 2025.
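The inline curly-brace tagging described in the MAP abstract can be illustrated with a minimal Python sketch. The abstract does not give the exact grammar, so the syntax assumed here (a one-letter residue optionally followed by a `{tag}` that applies to that residue) and the tag name `pho` are hypothetical and used only for illustration; the MAP manual is the reference for the real format.

```python
def parse_inline_tags(seq):
    """Split a tagged sequence into plain residues and per-residue tags.

    ASSUMED syntax (not taken from the MAP specification): each residue is
    one uppercase letter, optionally followed by a {tag} for that residue.
    Returns (plain_sequence, {1-based residue position: tag text}).
    """
    residues = []
    tags = {}
    i = 0
    while i < len(seq):
        ch = seq[i]
        if not (ch.isalpha() and ch.isupper()):
            raise ValueError(f"unexpected character {ch!r} at index {i}")
        residues.append(ch)
        i += 1
        # An opening brace directly after a residue starts that residue's tag.
        if i < len(seq) and seq[i] == "{":
            end = seq.find("}", i)
            if end == -1:
                raise ValueError(f"unclosed tag starting at index {i}")
            tags[len(residues)] = seq[i + 1:end]
            i = end + 1
    return "".join(residues), tags


# Hypothetical example: residue 2 (S) carries a tag named "pho".
plain, tags = parse_inline_tags("AS{pho}K")
```

Keeping the plain sequence separate from the position-to-tag map means tools that expect ordinary FASTA-style residues can still consume the sequence, while the tags remain available for annotation-aware code.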
-
A Multimodal Dataset for Enhancing Industrial Task Monitoring and Engagement Prediction
Authors:
Naval Kishore Mehta,
Arvind,
Himanshu Kumar,
Abeer Banerjee,
Sumeet Saurav,
Sanjay Singh
Abstract:
Detecting and interpreting operator actions, engagement, and object interactions in dynamic industrial workflows remains a significant challenge in human-robot collaboration research, especially within complex, real-world environments. Traditional unimodal methods often fall short of capturing the intricacies of these unstructured industrial settings. To address this gap, we present a novel Multimodal Industrial Activity Monitoring (MIAM) dataset that captures realistic assembly and disassembly tasks, facilitating the evaluation of key meta-tasks such as action localization, object interaction, and engagement prediction. The dataset comprises multi-view RGB, depth, and Inertial Measurement Unit (IMU) data collected from 22 sessions, amounting to 290 minutes of untrimmed video, annotated in detail for task performance and operator behavior. Its distinctiveness lies in the integration of multiple data modalities and its emphasis on real-world, untrimmed industrial workflows, both key for advancing research in human-robot collaboration and operator monitoring. Additionally, we propose a multimodal network that fuses RGB frames, IMU data, and skeleton sequences to predict engagement levels during industrial tasks. Our approach improves the accuracy of recognizing engagement states, providing a robust solution for monitoring operator performance in dynamic industrial environments. The dataset and code can be accessed from https://github.com/navalkishoremehta95/MIAM/.
Submitted 10 January, 2025;
originally announced January 2025.
-
Optimizing Multitask Industrial Processes with Predictive Action Guidance
Authors:
Naval Kishore Mehta,
Arvind,
Shyam Sunder Prasad,
Sumeet Saurav,
Sanjay Singh
Abstract:
Monitoring complex assembly processes is critical for maintaining productivity and ensuring compliance with assembly standards. However, variability in human actions and subjective task preferences complicate accurate task anticipation and guidance. To address these challenges, we introduce the Multi-Modal Transformer Fusion and Recurrent Units (MMTF-RU) Network for egocentric activity anticipation, utilizing multimodal fusion to improve prediction accuracy. Integrated with the Operator Action Monitoring Unit (OAMU), the system provides proactive operator guidance, preventing deviations in the assembly process. OAMU employs two strategies: (1) Top-5 MMTF-RU predictions, combined with a reference graph and an action dictionary, for next-step recommendations; and (2) Top-1 MMTF-RU predictions, integrated with a reference graph, for detecting sequence deviations and predicting anomaly scores via an entropy-informed confidence mechanism. We also introduce Time-Weighted Sequence Accuracy (TWSA) to evaluate operator efficiency and ensure timely task completion. Our approach is validated on the industrial Meccano dataset and the large-scale EPIC-Kitchens-55 dataset, demonstrating its effectiveness in dynamic environments.
Submitted 9 January, 2025;
originally announced January 2025.
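The abstract above mentions anomaly scores derived from an entropy-informed confidence mechanism but does not state the formula. As a rough sketch of one common way to turn prediction entropy into a score, the Python below computes the Shannon entropy of a predicted class distribution, normalized to the range 0 to 1. This is an assumption for illustration, not the authors' method; the function name `normalized_entropy` is hypothetical.

```python
import math


def normalized_entropy(probs):
    """Shannon entropy of a distribution, scaled to [0, 1].

    ASSUMPTION: this is a generic uncertainty score, not the formula used
    in the paper. 0 means a fully confident prediction; 1 means the
    prediction is spread uniformly over all classes.
    """
    total = sum(probs)
    if total <= 0:
        raise ValueError("probabilities must sum to a positive value")
    p = [x / total for x in probs]  # renormalize defensively
    if len(p) < 2:
        return 0.0  # a single class carries no uncertainty
    entropy = -sum(x * math.log(x) for x in p if x > 0)
    return entropy / math.log(len(p))


# A peaked distribution scores lower than a flat one.
confident = normalized_entropy([0.97, 0.01, 0.01, 0.01])
uncertain = normalized_entropy([0.25, 0.25, 0.25, 0.25])
```

Under this reading, a high score on the top predictions would flag a step where the model is unsure which action comes next, which is one plausible signal for a sequence deviation.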
-
Gaze-Vector Estimation in the Dark with Temporally Encoded Event-driven Neural Networks
Authors:
Abeer Banerjee,
Naval K. Mehta,
Shyam S. Prasad,
Himanshu,
Sumeet Saurav,
Sanjay Singh
Abstract:
In this paper, we address the intricate challenge of gaze vector prediction, a pivotal task with applications ranging from human-computer interaction to driver monitoring systems. Our approach is designed for the demanding setting of extremely low-light conditions, leveraging a novel temporal event encoding scheme and a dedicated neural network architecture. The temporal encoding method integrates Dynamic Vision Sensor (DVS) events with grayscale guide frames, generating consecutively encoded images for input into our neural network. This solution not only captures diverse gaze responses from participants within the active age group but also introduces a curated dataset tailored for low-light conditions. The encoded temporal frames paired with our network show strong spatial localization and reliable gaze direction in their predictions. With a 100-pixel accuracy of 100%, our results underscore the ability of our neural network to work with temporally consecutive encoded images for precise gaze vector predictions in challenging low-light videos, contributing to the advancement of gaze prediction technologies.
Submitted 5 March, 2024;
originally announced March 2024.
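The "100-pixel accuracy" reported in the abstract above is not defined there. A plausible reading, assumed here, is the fraction of predicted gaze points that fall within 100 pixels (Euclidean distance) of the ground-truth point. The Python sketch below implements that assumed definition; the function name `pixel_accuracy` and the inclusive threshold are illustrative choices.

```python
import math


def pixel_accuracy(predicted, truth, threshold=100.0):
    """Fraction of predictions within `threshold` pixels of the ground truth.

    ASSUMED definition of "N-pixel accuracy"; the paper may define it
    differently. Points are (x, y) tuples in pixel coordinates.
    """
    if len(predicted) != len(truth) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    hits = sum(
        1
        for (px, py), (tx, ty) in zip(predicted, truth)
        if math.hypot(px - tx, py - ty) <= threshold
    )
    return hits / len(predicted)


# One prediction is 50 px from its target, the other 200 px away.
score = pixel_accuracy([(0, 0), (0, 0)], [(30, 40), (200, 0)])
```

Under this definition a score of 1.0 means every prediction landed within the threshold radius, so the result depends strongly on how generous the threshold is relative to the image resolution.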