-
Fine-Grained control over Music Generation with Activation Steering
Authors:
Dipanshu Panda,
Jayden Koshy Joe,
Harshith M R,
Swathi Narashiman,
Pranay Mathur,
Anish Veerakumar,
Aniruddh Krishna,
Keerthiharan A
Abstract:
We present a method for fine-grained control over music generation through inference-time interventions on MusicGen, an autoregressive generative music transformer. Our approach enables timbre transfer, style transfer, and genre fusion by steering the residual stream using weights of linear probes trained on it, or by steering the attention layer activations in a similar manner. We observe that modelling this as a regression task provides improved performance, and we hypothesize that the mean-squared-error loss better preserves meaningful directional information in the activation space. Combined with the global conditioning offered by text prompts in MusicGen, our method provides both global and local control over music generation. Audio samples illustrating our method are available at our demo page.
Submitted 11 June, 2025;
originally announced June 2025.
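The residual-stream steering described in the abstract above can be illustrated with a minimal sketch. It assumes that steering means adding a scaled, unit-norm probe weight vector to each hidden state; the function name `steer`, the strength `alpha`, and the toy array sizes are hypothetical and are not taken from the paper or from MusicGen's API.

```python
import numpy as np

def steer(hidden, probe_w, alpha):
    """Shift every hidden state along the unit-norm probe direction by alpha."""
    direction = probe_w / np.linalg.norm(probe_w)
    return hidden + alpha * direction

rng = np.random.default_rng(0)
hidden = rng.normal(size=(4, 8))   # toy residual stream: (time steps, model width)
probe_w = rng.normal(size=8)       # stand-in for trained linear-probe weights
steered = steer(hidden, probe_w, alpha=2.0)
```

In a real model the same addition would be applied inside a forward hook at the chosen layer; the sketch only shows the arithmetic.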
-
FaceCraft4D: Animated 3D Facial Avatar Generation from a Single Image
Authors:
Fei Yin,
Mallikarjun B R,
Chun-Han Yao,
Rafał Mantiuk,
Varun Jampani
Abstract:
We present a novel framework for generating high-quality, animatable 4D avatars from a single image. While recent advances have shown promising results in 4D avatar creation, existing methods either require extensive multiview data or struggle with shape accuracy and identity consistency. To address these limitations, we propose a comprehensive system that leverages shape, image, and video priors to create full-view, animatable avatars. Our approach first obtains an initial coarse shape through 3D-GAN inversion. Then, it enhances multiview textures using depth-guided warping signals for cross-view consistency with the help of an image diffusion model. To handle expression animation, we incorporate a video prior with synchronized driving signals across viewpoints. We further introduce Consistent-Inconsistent training to effectively handle data inconsistencies during 4D reconstruction. Experimental results demonstrate that our method achieves superior quality compared to the prior art, while maintaining consistency across different viewpoints and expressions.
Submitted 21 April, 2025;
originally announced April 2025.
-
LTLCodeGen: Code Generation of Syntactically Correct Temporal Logic for Robot Task Planning
Authors:
Behrad Rabiei,
Mahesh Kumar A. R.,
Zhirui Dai,
Surya L. S. R. Pilla,
Qiyue Dong,
Nikolay Atanasov
Abstract:
This paper focuses on planning robot navigation tasks from natural language specifications. We develop a modular approach, where a large language model (LLM) translates the natural language instructions into a linear temporal logic (LTL) formula with propositions defined by object classes in a semantic occupancy map. The LTL formula and the semantic occupancy map are provided to a motion planning algorithm to generate a collision-free robot path that satisfies the natural language instructions. Our main contribution is LTLCodeGen, a method to translate natural language to syntactically correct LTL using code generation. We demonstrate the complete task planning method in real-world experiments involving human speech to provide navigation instructions to a mobile robot. We also thoroughly evaluate our approach in simulated and real-world experiments in comparison to end-to-end LLM task planning and state-of-the-art LLM-to-LTL translation methods.
Submitted 10 March, 2025;
originally announced March 2025.
-
Lite2Relight: 3D-aware Single Image Portrait Relighting
Authors:
Pramod Rao,
Gereon Fox,
Abhimitra Meka,
Mallikarjun B R,
Fangneng Zhan,
Tim Weyrich,
Bernd Bickel,
Hanspeter Pfister,
Wojciech Matusik,
Mohamed Elgharib,
Christian Theobalt
Abstract:
Achieving photorealistic 3D view synthesis and relighting of human portraits is pivotal for advancing AR/VR applications. Existing methodologies in portrait relighting demonstrate substantial limitations in terms of generalization and 3D consistency, coupled with inaccuracies in physically realistic lighting and identity preservation. Furthermore, personalization from a single view is difficult to achieve and often requires multiview images during the testing phase or involves slow optimization processes.
This paper introduces Lite2Relight, a novel technique that can predict 3D consistent head poses of portraits while performing physically plausible light editing at interactive speed. Our method uniquely extends the generative capabilities and efficient volumetric representation of EG3D, leveraging a lightstage dataset to implicitly disentangle face reflectance and perform relighting under target HDRI environment maps. By utilizing a pre-trained geometry-aware encoder and a feature alignment module, we map input images into a relightable 3D space, enhancing them with a strong face geometry and reflectance prior.
Through extensive quantitative and qualitative evaluations, we show that our method outperforms the state-of-the-art methods in terms of efficacy, photorealism, and practical application. This includes producing 3D-consistent results of the full head, including hair, eyes, and expressions. Lite2Relight paves the way for large-scale adoption of photorealistic portrait editing in various domains, offering a robust, interactive solution to a previously constrained problem. Project page: https://vcai.mpi-inf.mpg.de/projects/Lite2Relight/
Submitted 15 July, 2024;
originally announced July 2024.
-
Fine-grained large-scale content recommendations for MSX sellers
Authors:
Manpreet Singh,
Ravdeep Pasricha,
Ravi Prasad Kondapalli,
Kiran R,
Nitish Singh,
Akshita Agarwalla,
Manoj R,
Manish Prabhakar,
Laurent Boué
Abstract:
One of the most critical tasks of Microsoft sellers is to meticulously track and nurture potential business opportunities through proactive engagement and tailored solutions. Recommender systems play a central role in helping sellers achieve their goals. In this paper, we present a content recommendation model which surfaces various types of content (technical documentation, comparison with competitor products, customer success stories, etc.) that sellers can share with their customers or use for their own self-learning. The model operates at the opportunity level, which is the lowest possible granularity and the most relevant one for sellers. It is based on semantic matching between metadata from the contents and carefully selected attributes of the opportunities. Considering the volume of seller-managed opportunities in organizations such as Microsoft, we show how to perform efficient semantic matching over a very large number of opportunity-content combinations. The main challenge is to ensure that the top-5 relevant contents for each opportunity are recommended out of a total of $\approx 40,000$ published contents. We achieve this target through an extensive comparison of different model architectures and feature selection. Finally, we examine the quality of the recommendations quantitatively, using both human domain experts and the recently proposed "LLM as a judge" framework.
Submitted 9 July, 2024;
originally announced July 2024.
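The top-5 matching step mentioned in the abstract above can be sketched as a nearest-neighbour lookup. This is only an illustration under the assumption that opportunities and contents are represented as embedding vectors compared by cosine similarity; the abstract does not state the actual model, and `top_k_contents` and the toy sizes are hypothetical.

```python
import numpy as np

def top_k_contents(opp_emb, content_emb, k=5):
    """Return, per opportunity, the indices of the k most similar contents (best first)."""
    opp = opp_emb / np.linalg.norm(opp_emb, axis=1, keepdims=True)
    con = content_emb / np.linalg.norm(content_emb, axis=1, keepdims=True)
    scores = opp @ con.T                                   # cosine similarities
    # Partial selection avoids fully sorting every row of a large score matrix.
    idx = np.argpartition(-scores, k - 1, axis=1)[:, :k]
    rows = np.arange(scores.shape[0])[:, None]
    order = np.argsort(-scores[rows, idx], axis=1)         # sort only the k kept scores
    return idx[rows, order]

rng = np.random.default_rng(0)
opportunities = rng.normal(size=(3, 16))    # toy opportunity embeddings
contents = rng.normal(size=(100, 16))       # toy content embeddings
recommended = top_k_contents(opportunities, contents, k=5)
```

At the scale quoted in the abstract, the score matrix would normally be computed in batches or with an approximate nearest-neighbour index rather than in one dense product.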
-
Student Perspectives on Using a Large Language Model (LLM) for an Assignment on Professional Ethics
Authors:
Virginia Grande,
Natalie Kiesler,
Maria Andreina Francisco R
Abstract:
The advent of Large Language Models (LLMs) started a serious discussion among educators on how LLMs would affect, e.g., curricula, assessments, and students' competencies. Generative AI and LLMs also raised ethical questions and concerns for computing educators and professionals. This experience report presents an assignment within a course on professional competencies, including some related to ethics, that computing master's students need in their careers. For the assignment, student groups discussed the ethical process by Lennerfors et al. by analyzing a case: a fictional researcher considers whether to attend the real CHI 2024 conference in Hawaii. The tasks were (1) to participate in in-class discussions on the case, (2) to use an LLM of their choice as a discussion partner for said case, and (3) to document both discussions, reflecting on their use of the LLM. Students reported positive experiences with the LLM as a way to increase their knowledge and understanding, although some identified limitations. The LLM provided a wider set of options for action in the studied case, including unfeasible ones. The LLM would not select a course of action, so students had to choose themselves, which they saw as coherent. From the educators' perspective, there is a need for more instruction for students using LLMs: some students did not perceive the tools as such but rather as an authoritative knowledge base. Therefore, this work has implications for educators considering the use of LLMs as discussion partners or tools to practice critical thinking, especially in computing ethics education.
Submitted 9 April, 2024;
originally announced June 2024.
-
Navigating Tabular Data Synthesis Research: Understanding User Needs and Tool Capabilities
Authors:
Maria F. Davila R.,
Sven Groen,
Fabian Panse,
Wolfram Wingerath
Abstract:
In an era of rapidly advancing data-driven applications, there is a growing demand for data in both research and practice. Synthetic data have emerged as an alternative when no real data are available (e.g., due to privacy regulations). Synthesizing tabular data presents unique and complex challenges, especially handling (i) missing values, (ii) dataset imbalance, (iii) diverse column types, and (iv) complex data distributions, as well as preserving (i) column correlations, (ii) temporal dependencies, and (iii) integrity constraints (e.g., functional dependencies) present in the original dataset. While substantial progress has been made recently in the context of generative models, there is no one-size-fits-all solution for tabular data today, and choosing the right tool for a given task is therefore not trivial. In this paper, we survey the state of the art in Tabular Data Synthesis (TDS), examine the needs of users by defining a set of functional and non-functional requirements, and compile the challenges associated with meeting those needs. In addition, we evaluate the reported performance of 36 popular research TDS tools against these requirements and develop a decision guide to help users find suitable TDS tools for their applications. The resulting decision guide also identifies significant research gaps.
Submitted 31 May, 2024;
originally announced May 2024.
-
Enhanced Precision in Rainfall Forecasting for Mumbai: Utilizing Physics Informed ConvLSTM2D Models for Finer Spatial and Temporal Resolution
Authors:
Ajay Devda,
Akshay Sunil,
Murthy R,
B Deepthi
Abstract:
Forecasting rainfall in tropical areas is challenging due to complex atmospheric behaviour, elevated humidity levels, and the common presence of convective rain events. In the Indian context, the difficulty is further exacerbated by the monsoon intraseasonal oscillations, which introduce significant variability in rainfall patterns over short periods. Earlier investigations into rainfall prediction leveraged numerical weather prediction methods, along with statistical and deep learning approaches. This study introduces a deep learning spatial model aimed at enhancing rainfall prediction accuracy on a finer scale. We hypothesize that integrating physical understanding improves the precipitation prediction skill of deep learning models with high precision for finer spatial scales, such as cities. To test this hypothesis, we introduce a physics informed ConvLSTM2D model to predict precipitation 6 hr and 12 hr ahead for Mumbai, India. We utilize ERA5 reanalysis data to select predictor variables across various geopotential levels. The ConvLSTM2D model was trained on the target variable precipitation for 4 different grids representing different spatial grid locations of Mumbai. Thus, the use of the ConvLSTM2D model for rainfall prediction, utilizing physics informed data from specific grids with limited spatial information, reflects current advancements in meteorological research that emphasize both efficiency and localized precision.
Submitted 1 April, 2024;
originally announced April 2024.
-
A case study of Generative AI in MSX Sales Copilot: Improving seller productivity with a real-time question-answering system for content recommendation
Authors:
Manpreet Singh,
Ravdeep Pasricha,
Nitish Singh,
Ravi Prasad Kondapalli,
Manoj R,
Kiran R,
Laurent Boué
Abstract:
In this paper, we design a real-time question-answering system specifically targeted at helping sellers get relevant material/documentation they can share live with their customers or refer to during a call. Taking the Seismic content repository as a relatively large-scale example of a diverse dataset of sales material, we demonstrate how LLM embeddings of sellers' queries can be matched with the relevant content. We achieve this by engineering prompts in an elaborate fashion that makes use of the rich set of meta-features available for documents and sellers. Using a bi-encoder with cross-encoder re-ranker architecture, we show how the solution returns the most relevant content recommendations in just a few seconds even for large datasets. Our recommender system is deployed as an AML endpoint for real-time inferencing and has been integrated into a Copilot interface that is now deployed in the production version of the Dynamics CRM, known as MSX, used daily by Microsoft sellers.
Submitted 4 January, 2024;
originally announced January 2024.
-
Multimodality in Online Education: A Comparative Study
Authors:
Praneeta Immadisetty,
Pooja Rajesh,
Akshita Gupta,
Anala M R,
Soumya A,
K. N. Subramanya
Abstract:
The commencement of the decade brought along with it a grave pandemic and, in response, the movement of education forums predominantly into the online world. With a surge in the usage of online video conferencing platforms and tools to better gauge student understanding, there needs to be a mechanism to assess whether instructors can grasp the extent to which students understand the subject and their response to the educational stimuli. The current systems consider only a single cue and lack focus on the educational domain. Thus, there is a necessity for the measurement of an all-encompassing holistic overview of the students' reaction to the subject matter. This paper highlights the need for a multimodal approach to affect recognition and its deployment in the online classroom while considering four cues: posture and gesture, facial, eye tracking, and verbal recognition. It compares the various machine learning models available for each cue and provides the most suitable approach given the available dataset and parameters of classroom footage. A multimodal approach derived from weighted majority voting is proposed by combining the most fitting models from this analysis of individual cues based on accuracy, ease of procuring data corpus, sensitivity and any major drawbacks.
Submitted 17 December, 2023; v1 submitted 10 December, 2023;
originally announced December 2023.
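The weighted majority voting named in the abstract above can be sketched as follows. This is a generic illustration, not the paper's implementation: the cue names, labels, and weights are hypothetical placeholders, and in practice the weights would come from each cue model's measured performance.

```python
from collections import defaultdict

def weighted_majority_vote(predictions, weights):
    """Pick the label whose supporting cue models carry the largest total weight."""
    totals = defaultdict(float)
    for cue, label in predictions.items():
        totals[label] += weights[cue]
    return max(totals, key=totals.get)

# Hypothetical per-cue predictions and weights, for illustration only.
predictions = {"posture": "engaged", "facial": "confused",
               "eye": "engaged", "verbal": "confused"}
weights = {"posture": 0.2, "facial": 0.35, "eye": 0.15, "verbal": 0.4}
result = weighted_majority_vote(predictions, weights)
```

Here "confused" wins with a total weight of 0.75 against 0.35, even though each label is predicted by two cues.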
-
Development of a Legal Document AI-Chatbot
Authors:
Pranav Nataraj Devaraj,
Rakesh Teja P V,
Aaryav Gangrade,
Manoj Kumar R
Abstract:
With the exponential growth of digital data and the increasing complexity of legal documentation, there is a pressing need for efficient and intelligent tools to streamline the handling of legal documents. With the recent developments in the AI field, especially in chatbots, AI cannot be ignored as a very compelling solution to this problem. An insight into the process of creating a Legal Documentation AI Chatbot with as many relevant features as possible within the given time frame is presented. The development of each component of the chatbot is presented in detail. Each component's workings and functionality are discussed, from the build of the Android app and the Langchain query processing code through to the integration of both via a Flask backend and REST API methods.
Submitted 21 November, 2023;
originally announced November 2023.
-
Malware Classification using Deep Neural Networks: Performance Evaluation and Applications in Edge Devices
Authors:
Akhil M R,
Adithya Krishna V Sharma,
Harivardhan Swamy,
Pavan A,
Ashray Shetty,
Anirudh B Sathyanarayana
Abstract:
With the increasing extent of malware attacks in the present day along with the difficulty in detecting modern malware, it is necessary to evaluate the effectiveness and performance of Deep Neural Networks (DNNs) for malware classification. Multiple DNN architectures can be designed and trained to detect and classify malware binaries. Results demonstrate the potential of DNNs in classifying malware, with high accuracy rates observed across different malware types. Additionally, the feasibility of deploying these DNN models on edge devices to enable real-time classification, particularly in resource-constrained scenarios, proves to be integral to large IoT systems. By optimizing model architectures and leveraging edge computing capabilities, the proposed methodologies achieve efficient performance even with limited resources. This study contributes to advancing malware detection techniques and emphasizes the significance of integrating cybersecurity measures for the early detection of malware and further preventing the adverse effects caused by such attacks. Optimal considerations regarding the distribution of security tasks to edge devices are addressed to ensure that the integrity and availability of large-scale IoT systems are not compromised due to malware attacks, advocating for a more resilient and secure digital ecosystem.
Submitted 21 August, 2023;
originally announced October 2023.
-
AvatarStudio: Text-driven Editing of 3D Dynamic Human Head Avatars
Authors:
Mohit Mendiratta,
Xingang Pan,
Mohamed Elgharib,
Kartik Teotia,
Mallikarjun B R,
Ayush Tewari,
Vladislav Golyanik,
Adam Kortylewski,
Christian Theobalt
Abstract:
Capturing and editing full head performances enables the creation of virtual characters with various applications such as extended reality and media production. The past few years witnessed a steep rise in the photorealism of human head avatars. Such avatars can be controlled through different input data modalities, including RGB, audio, depth, IMUs and others. While these data modalities provide effective means of control, they mostly focus on editing the head movements such as the facial expressions, head pose and/or camera viewpoint. In this paper, we propose AvatarStudio, a text-based method for editing the appearance of a dynamic full head avatar. Our approach builds on existing work to capture dynamic performances of human heads using neural radiance field (NeRF) and edits this representation with a text-to-image diffusion model. Specifically, we introduce an optimization strategy for incorporating multiple keyframes representing different camera viewpoints and time stamps of a video performance into a single diffusion model. Using this personalized diffusion model, we edit the dynamic NeRF by introducing view-and-time-aware Score Distillation Sampling (VT-SDS) following a model-based guidance approach. Our method edits the full head in a canonical space, and then propagates these edits to remaining time steps via a pretrained deformation network. We evaluate our method visually and numerically via a user study, and results show that our method outperforms existing approaches. Our experiments validate the design choices of our method and highlight that our edits are genuine, personalized, as well as 3D- and time-consistent.
Submitted 2 June, 2023; v1 submitted 1 June, 2023;
originally announced June 2023.
-
Neuromorphic Computing with AER using Time-to-Event-Margin Propagation
Authors:
Madhuvanthi Srivatsav R,
Shantanu Chakrabartty,
Chetan Singh Thakur
Abstract:
Address-Event-Representation (AER) is a spike-routing protocol that allows the scaling of neuromorphic and spiking neural network (SNN) architectures to a size that is comparable to that of digital neural network architectures. However, in conventional neuromorphic architectures, the AER protocol and, in general, any virtual interconnect plays only a passive role in computation, i.e., only for routing spikes and events. In this paper, we show how causal temporal primitives like delay, triggering, and sorting inherent in the AER protocol itself can be exploited for scalable neuromorphic computing using our proposed technique called Time-to-Event Margin Propagation (TEMP). The proposed TEMP-based AER architecture is fully asynchronous and relies on interconnect delays for memory and computing as opposed to conventional and local multiply-and-accumulate (MAC) operations. We show that the time-based encoding in the TEMP neural network produces a spatio-temporal representation that can encode a large number of discriminatory patterns. As a proof-of-concept, we show that a trained TEMP-based convolutional neural network (CNN) can demonstrate an accuracy greater than 99% on the MNIST dataset. Overall, our work is a biologically inspired computing paradigm that brings forth a new dimension of research to the field of neuromorphic computing.
Submitted 26 April, 2023;
originally announced April 2023.
-
GVP: Generative Volumetric Primitives
Authors:
Mallikarjun B R,
Xingang Pan,
Mohamed Elgharib,
Christian Theobalt
Abstract:
Advances in 3D-aware generative models have pushed the boundary of image synthesis with explicit camera control. To achieve high-resolution image synthesis, several attempts have been made to design efficient generators, such as hybrid architectures with both 3D and 2D components. However, such a design compromises multiview consistency, and the design of a pure 3D generator with high resolution is still an open problem. In this work, we present Generative Volumetric Primitives (GVP), the first pure 3D generative model that can sample and render 512-resolution images in real-time. GVP jointly models a number of volumetric primitives and their spatial information, both of which can be efficiently generated via a 2D convolutional network. The mixture of these primitives naturally captures the sparsity and correspondence in the 3D volume. The training of such a generator with a high degree of freedom is made possible through a knowledge distillation technique. Experiments on several datasets demonstrate superior efficiency and 3D consistency of GVP over the state-of-the-art.
Submitted 31 March, 2023;
originally announced March 2023.
-
HQ3DAvatar: High Quality Controllable 3D Head Avatar
Authors:
Kartik Teotia,
Mallikarjun B R,
Xingang Pan,
Hyeongwoo Kim,
Pablo Garrido,
Mohamed Elgharib,
Christian Theobalt
Abstract:
Multi-view volumetric rendering techniques have recently shown great potential in modeling and synthesizing high-quality head avatars. A common approach to capture full head dynamic performances is to track the underlying geometry using a mesh-based template or 3D cube-based graphics primitives. While these model-based approaches achieve promising results, they often fail to learn complex geometric details such as the mouth interior, hair, and topological changes over time. This paper presents a novel approach to building highly photorealistic digital head avatars. Our method learns a canonical space via an implicit function parameterized by a neural network. It leverages multiresolution hash encoding in the learned feature space, allowing for high-quality, faster training and high-resolution rendering. At test time, our method is driven by a monocular RGB video. Here, an image encoder extracts face-specific features that also condition the learnable canonical space. This encourages deformation-dependent texture variations during training. We also propose a novel optical flow based loss that ensures correspondences in the learned canonical space, thus encouraging artifact-free and temporally consistent renderings. We show results on challenging facial expressions and show free-viewpoint renderings at interactive real-time rates for medium image resolutions. Our method outperforms all existing approaches, both visually and numerically. We will release our multiple-identity dataset to encourage further research. Our Project page is available at: https://vcai.mpi-inf.mpg.de/projects/HQ3DAvatar/
Submitted 25 March, 2023;
originally announced March 2023.
-
LiveHand: Real-time and Photorealistic Neural Hand Rendering
Authors:
Akshay Mundra,
Mallikarjun B R,
Jiayi Wang,
Marc Habermann,
Christian Theobalt,
Mohamed Elgharib
Abstract:
The human hand is the main medium through which we interact with our surroundings, making its digitization an important problem. While there are several works modeling the geometry of hands, little attention has been paid to capturing photo-realistic appearance. Moreover, for applications in extended reality and gaming, real-time rendering is critical. We present the first neural-implicit approach to photo-realistically render hands in real-time. This is a challenging problem as hands are textured and undergo strong articulations with pose-dependent effects. However, we show that this aim is achievable through our carefully designed method. This includes training on a low-resolution rendering of a neural radiance field, together with a 3D-consistent super-resolution module and mesh-guided sampling and space canonicalization. We demonstrate a novel application of perceptual loss on the image space, which is critical for learning details accurately. We also show a live demo where we photo-realistically render the human hand in real-time for the first time, while also modeling pose- and view-dependent appearance effects. We ablate all our design choices and show that they optimize for rendering speed and quality. Video results and our code can be accessed from https://vcai.mpi-inf.mpg.de/projects/LiveHand/
Submitted 20 August, 2023; v1 submitted 15 February, 2023;
originally announced February 2023.
-
State of the Art in Dense Monocular Non-Rigid 3D Reconstruction
Authors:
Edith Tretschk,
Navami Kairanda,
Mallikarjun B R,
Rishabh Dabral,
Adam Kortylewski,
Bernhard Egger,
Marc Habermann,
Pascal Fua,
Christian Theobalt,
Vladislav Golyanik
Abstract:
3D reconstruction of deformable (or non-rigid) scenes from a set of monocular 2D image observations is a long-standing and actively researched area of computer vision and graphics. It is an ill-posed inverse problem, since -- without additional prior assumptions -- it permits infinitely many solutions leading to accurate projection to the input 2D images. Non-rigid reconstruction is a foundational building block for downstream applications like robotics, AR/VR, or visual content creation. The key advantage of using monocular cameras is their omnipresence and availability to the end users as well as their ease of use compared to more sophisticated camera set-ups such as stereo or multi-view systems. This survey focuses on state-of-the-art methods for dense non-rigid 3D reconstruction of various deformable objects and composite scenes from monocular videos or sets of monocular views. It reviews the fundamentals of 3D reconstruction and deformation modeling from 2D image observations. We then start from general methods -- that handle arbitrary scenes and make only a few prior assumptions -- and proceed towards techniques making stronger assumptions about the observed objects and types of deformations (e.g. human faces, bodies, hands, and animals). A significant part of this STAR is also devoted to classification and a high-level comparison of the methods, as well as an overview of the datasets for training and evaluation of the discussed techniques. We conclude by discussing open challenges in the field and the social aspects associated with the usage of the reviewed methods.
Submitted 24 March, 2023; v1 submitted 27 October, 2022;
originally announced October 2022.
-
O-type Stars Stellar Parameter Estimation Using Recurrent Neural Networks
Authors:
Miguel Flores R.,
Luis J. Corral,
Celia R. Fierro-Santillán,
Silvana G. Navarro
Abstract:
In this paper, we present a deep learning system approach to estimating luminosity, effective temperature, and surface gravity of O-type stars using the optical region of the stellar spectra. In previous work, we compare a set of machine learning and deep learning algorithms in order to establish a reliable way to fit a stellar model using two methods: the classification of the stellar spectra models and the estimation of the physical parameters in a regression-type task. Here we present the process to estimate individual physical parameters from an artificial neural network perspective with the capacity to handle stellar spectra with a low signal-to-noise ratio (S/N), in the $<$20 S/N boundaries. The development of three different recurrent neural network systems, the training process using stellar spectra models, the test over nine different observed stellar spectra, and the comparison with estimations in previous works are presented. Additionally, characterization methods for stellar spectra in order to reduce the dimensionality of the input data for the system and optimize the computational resources are discussed.
Submitted 27 October, 2022; v1 submitted 23 October, 2022;
originally announced October 2022.
-
Transformer-based Flood Scene Segmentation for Developing Countries
Authors:
Ahan M R,
Roshan Roy,
Shreyas Sunil Kulkarni,
Vaibhav Soni,
Ashish Chittora
Abstract:
Floods are large-scale natural disasters that often induce a massive number of deaths, extensive material damage, and economic turmoil. The effects are more extensive and longer-lasting in high-population and low-resource developing countries. Early Warning Systems (EWS) constantly assess water levels and other factors to forecast floods, to help minimize damage. Post-disaster, disaster response teams undertake a Post Disaster Needs Assessment (PDSA) to assess structural damage and determine optimal strategies to respond to highly affected neighbourhoods. However, even today in developing countries, EWS and PDSA analysis of large volumes of image and video data is largely a manual process undertaken by first responders and volunteers. We propose FloodTransformer, which to the best of our knowledge, is the first visual transformer-based model to detect and segment flooded areas from aerial images at disaster sites. We also propose a custom metric, Flood Capacity (FC) to measure the spatial extent of water coverage and quantify the segmented flooded area for EWS and PDSA analyses. We use the SWOC Flood segmentation dataset and achieve 0.93 mIoU, outperforming all other methods. We further show the robustness of this approach by validating across unseen flood images from other flood data sources.
Submitted 9 October, 2022;
originally announced October 2022.
-
An Interface for Variational Quantum Eigensolver based Energy (VQE-E) and Force (VQE-F) Calculator to Atomic Simulation Environment (ASE)
Authors:
Nirmal M R,
Shampa Sarkar,
Manoj Nambiar
Abstract:
The development of quantum algorithms to solve quantum chemistry problems has offered a promising new paradigm of performing computer simulations at the scale of atoms and molecules. Although the majority of the research so far has focused on designing quantum algorithms to compute ground and excited state energies and forces, it is useful to run different simulation tasks, such as geometry optimization, with these algorithms as subroutines. Towards this end, we have created an interface for the Variational Quantum Eigensolver based molecular Energy (VQE-E) and molecular Force (VQE-F) code to the Atomic Simulation Environment (ASE). We demonstrate the working of this hybrid quantum-classical interface by optimizing the geometry of a water molecule using a native optimizer implemented in ASE. Furthermore, this interface enables one to compare, combine and use quantum algorithms in conjunction with related classical methods quite easily with minimal coding effort.
Submitted 28 September, 2022;
originally announced September 2022.
-
Blockchain based digital vaccine passport
Authors:
Ms. Megha Rani R,
Roshan R Acharya,
Ramkishan,
Ranjith K,
Rakshith Ay Gowda
Abstract:
Travel has been challenging recently since different nations have implemented varied immigration and travel policies. For the time being, immigration officials want proof of each person's immunity to the virus. A vaccine passport serves as evidence that a person has tested negative for or is immune to a particular virus. In terms of COVID-19, those who hold a vaccine passport will be permitted entry into other nations as long as they can provide proof that they have COVID-19 antibodies from prior infections or from full COVID-19 immunizations. To reduce time and effort spent managing data, the vaccination passport system has been digitalized. The process of contact tracing may be facilitated by digitization. The "Blockchain technology" system, which is currently in use, has demonstrated its security and privacy in systems for data exchange among bitcoin users. The Digital Vaccination Passport scheme can use Blockchain technology. The end result would be a decentralized, traceable, transparent, reliable, auditable, secure, and trustworthy solution based on the Ethereum block-chain that would allow tracking of vaccines given and the history of diseases.
Submitted 18 August, 2022;
originally announced August 2022.
-
Designing Interference-Immune Doppler-Tolerant Waveforms for Automotive Radar Applications
Authors:
Robin Amar,
Mohammad Alaee-Kerahroodi,
Prabhu Babu,
Bhavani Shankar M. R
Abstract:
Dynamic target detection using FMCW waveform is challenging in the presence of interference for different radar applications. Degradation in SNR is irreparable and interference is difficult to mitigate in time and frequency domain. In this paper, a waveform design problem is addressed using the Majorization-Minimization (MM) framework by considering PSL/ISL cost functions, resulting in a code sequence with Doppler-tolerance characteristics of an FMCW waveform and interference immune characteristics of a tailored PMCW waveform (unique phase code + minimal ISL/PSL). The optimal design sequences possess polynomial phase behavior of degree Q amongst its sub-sequences and obtain optimal ISL and PSL solutions with guaranteed convergence. By tuning the optimization parameters such as degree Q of the polynomial phase behavior, sub-sequence length M and the total number of sub-sequences L, the optimized sequences can be as Doppler tolerant as FMCW waveform in one end, and they can possess small cross-correlation values similar to random-phase sequences in PMCW waveform on the other end. If required in the event of acute interference, new codes can be generated in the runtime which have low cross-correlation with the interferers. The performance analysis indicates that the proposed method outperforms the state-of-the-art counterparts.
Submitted 5 April, 2022;
originally announced April 2022.
-
Disentangled3D: Learning a 3D Generative Model with Disentangled Geometry and Appearance from Monocular Images
Authors:
Ayush Tewari,
Mallikarjun B R,
Xingang Pan,
Ohad Fried,
Maneesh Agrawala,
Christian Theobalt
Abstract:
Learning 3D generative models from a dataset of monocular images enables self-supervised 3D reasoning and controllable synthesis. State-of-the-art 3D generative models are GANs which use neural 3D volumetric representations for synthesis. Images are synthesized by rendering the volumes from a given camera. These models can disentangle the 3D scene from the camera viewpoint in any generated image. However, most models do not disentangle other factors of image formation, such as geometry and appearance. In this paper, we design a 3D GAN which can learn a disentangled model of objects, just from monocular observations. Our model can disentangle the geometry and appearance variations in the scene, i.e., we can independently sample from the geometry and appearance spaces of the generative model. This is achieved using a novel non-rigid deformable scene formulation. A 3D volume which represents an object instance is computed as a non-rigidly deformed canonical 3D volume. Our method learns the canonical volume, as well as its deformations, jointly during training. This formulation also helps us improve the disentanglement between the 3D scene and the camera viewpoints using a novel pose regularization loss defined on the 3D deformation field. In addition, we further model the inverse deformations, enabling the computation of dense correspondences between images generated by our model. Finally, we design an approach to embed real images into the latent space of our disentangled generative model, enabling editing of real images.
Submitted 29 March, 2022;
originally announced March 2022.
-
Performance evaluation of the QOS provisioning ability of IEEE 802.11e WLAN standard for multimedia traffic
Authors:
Venkata Sitaram. A,
Venkatesh. T. G,
Arun George,
Manivasakan. R,
Bhasker Dappuri
Abstract:
This paper presents an analytical model for the average frame transmission delay and the jitter for the different Access Categories (ACs) of the IEEE 802.11e Enhanced Distributed Channel Access (EDCA) mechanism. The salient features of our model are as follows. As defined by the standard, we consider (1) the virtual collisions among different ACs inside each EDCA station in addition to external collisions; (2) the effect of priority parameters, such as minimum and maximum values of Contention Window (CW) sizes and Arbitration Inter Frame Space (AIFS); (3) the role of Transmission Opportunity (TXOP) of different ACs; and (4) the finite number of retrials a packet experiences before being dropped. Our model and analytical results provide an in-depth understanding of the EDCA mechanism and the effect of Quality of Service (QoS) parameters on the performance of the IEEE 802.11e protocol.
Submitted 14 December, 2021;
originally announced December 2021.
-
IR Motion Deblurring
Authors:
Nisha Varghese,
Mahesh Mohan M. R.,
A. N. Rajagopalan
Abstract:
Camera gimbal systems are important in various air or water borne systems for applications such as navigation, target tracking, security and surveillance. A higher steering rate (rotation angle per second) of gimbal is preferable for real-time applications since a given field-of-view (FOV) can be revisited within a short period of time. However, due to relative motion between the gimbal and scene during the exposure time, the captured video frames can suffer from motion blur. Since most of the post-capture applications require blur-free images, motion deblurring in real-time is an important need. Even though there exist blind deblurring methods which aim to retrieve latent images from blurry inputs, they are constrained by very high-dimensional optimization, thus incurring large execution times. On the other hand, deep learning methods for motion deblurring, though fast, do not generalize satisfactorily to different domains (e.g., air, water, etc.). In this work, we address the problem of real-time motion deblurring in infrared (IR) images captured by a gimbal-based system. We reveal how a priori knowledge of the blur-kernel can be used in conjunction with non-blind deblurring methods to achieve real-time performance. Importantly, our mathematical model can be leveraged to create large-scale datasets with realistic gimbal motion blur. Such datasets, which are a rarity, can be a valuable asset for contemporary deep learning methods. We show that, in comparison to the state-of-the-art techniques in deblurring, our method is better suited for practical gimbal-based imaging systems.
Submitted 23 November, 2021;
originally announced November 2021.
-
NASA Space Robotics Challenge 2 Qualification Round: An Approach to Autonomous Lunar Rover Operations
Authors:
Cagri Kilic,
Bernardo Martinez R. Jr.,
Christopher A. Tatsch,
Jared Beard,
Jared Strader,
Shounak Das,
Derek Ross,
Yu Gu,
Guilherme A. S. Pereira,
Jason N. Gross
Abstract:
Plans for establishing a long-term human presence on the Moon will require substantial increases in robot autonomy and multi-robot coordination to support establishing a lunar outpost. To achieve these objectives, algorithm design choices for the software developments need to be tested and validated for expected scenarios such as autonomous in-situ resource utilization (ISRU), localization in challenging environments, and multi-robot coordination. However, real-world experiments are extremely challenging and limited for extraterrestrial environments. Also, realistic simulation demonstrations in these environments are still rare and in demand for initial algorithm testing. To help address some of these needs, the NASA Centennial Challenges program established the Space Robotics Challenge Phase 2 (SRC2), which consists of virtual robotic systems in a realistic lunar simulation environment, where a group of mobile robots was tasked with reporting volatile locations within a global map, excavating and transporting these resources, and detecting and localizing a target of interest. The main goal of this article is to share our team's experiences on the design trade-offs to perform autonomous robotic operations in a virtual lunar environment and to share strategies to complete the mission requirements posed by the NASA SRC2 competition during the qualification round. Of the 114 teams that registered for participation in the NASA SRC2, team Mountaineers finished as one of only six teams to receive the top qualification round prize.
Submitted 20 September, 2021;
originally announced September 2021.
-
Heterogeneously-Distributed Joint Radar Communications: Bayesian Resource Allocation
Authors:
Linlong Wu,
Kumar Vijay Mishra,
Bhavani Shankar M. R.,
Björn Ottersten
Abstract:
Due to spectrum scarcity, the coexistence of radar and wireless communication has gained substantial research interest recently. Among many scenarios, the heterogeneously-distributed joint radar-communication system is promising due to its flexibility and compatibility with existing architectures. In this paper, we focus on a heterogeneous radar and communication network (HRCN), which consists of various generic radars for multiple target tracking (MTT) and wireless communications for multiple users. We aim to improve the MTT performance and maintain good throughput levels for communication users by a well-designed resource allocation. The problem is formulated as a Bayesian Cramér-Rao bound (CRB) based minimization subject to resource budgets and throughput constraints. The formulated nonconvex problem is solved based on an alternating descent-ascent approach. Numerical results demonstrate the efficacy of the proposed allocation scheme for this heterogeneous network.
Submitted 4 March, 2022; v1 submitted 29 July, 2021;
originally announced July 2021.
-
Efficient and Differentiable Shadow Computation for Inverse Problems
Authors:
Linjie Lyu,
Marc Habermann,
Lingjie Liu,
Mallikarjun B R,
Ayush Tewari,
Christian Theobalt
Abstract:
Differentiable rendering has received increasing interest for image-based inverse problems. It can benefit traditional optimization-based solutions to inverse problems, but also allows for self-supervision of learning-based approaches for which training data with ground truth annotation is hard to obtain. However, existing differentiable renderers either do not model visibility of the light sources from the different points in the scene, responsible for shadows in the images, or are too slow for being used to train deep architectures over thousands of iterations. To this end, we propose an accurate yet efficient approach for differentiable visibility and soft shadow computation. Our approach is based on the spherical harmonics approximations of the scene illumination and visibility, where the occluding surface is approximated with spheres. This allows for a significantly more efficient shadow computation compared to methods based on ray tracing. As our formulation is differentiable, it can be used to solve inverse problems such as texture, illumination, rigid pose, and geometric deformation recovery from images using analysis-by-synthesis optimization.
Submitted 1 April, 2021;
originally announced April 2021.
-
PhotoApp: Photorealistic Appearance Editing of Head Portraits
Authors:
Mallikarjun B R,
Ayush Tewari,
Abdallah Dib,
Tim Weyrich,
Bernd Bickel,
Hans-Peter Seidel,
Hanspeter Pfister,
Wojciech Matusik,
Louis Chevallier,
Mohamed Elgharib,
Christian Theobalt
Abstract:
Photorealistic editing of portraits is a challenging task as humans are very sensitive to inconsistencies in faces. We present an approach for high-quality intuitive editing of the camera viewpoint and scene illumination in a portrait image. This requires our method to capture and control the full reflectance field of the person in the image. Most editing approaches rely on supervised learning using training data captured with setups such as light and camera stages. Such datasets are expensive to acquire, not readily available and do not capture all the rich variations of in-the-wild portrait images. In addition, most supervised approaches only focus on relighting, and do not allow camera viewpoint editing. Thus, they only capture and control a subset of the reflectance field. Recently, portrait editing has been demonstrated by operating in the generative model space of StyleGAN. While such approaches do not require direct supervision, there is a significant loss of quality when compared to the supervised approaches. In this paper, we present a method which learns from limited supervised training data. The training images only include people in a fixed neutral expression with eyes closed, without much hair or background variations. Each person is captured under 150 one-light-at-a-time conditions and under 8 camera poses. Instead of training directly in the image space, we design a supervised problem which learns transformations in the latent space of StyleGAN. This combines the best of supervised learning and generative adversarial modeling. We show that the StyleGAN prior allows for generalisation to different expressions, hairstyles and backgrounds. This produces high-quality photorealistic results for in-the-wild images and significantly outperforms existing methods. Our approach can edit the illumination and pose simultaneously, and runs at interactive rates.
Submitted 13 May, 2021; v1 submitted 13 March, 2021;
originally announced March 2021.
-
Design and implementation of Energy Efficient Lightweight Encryption (EELWE) algorithm for medical applications
Authors:
Radhika Rani Chintala,
Narasinga Rao M R,
Somu Venkateswarlu
Abstract:
Proportional to the growth in the usage of Human Sensor Networks (HSN), the volume of the data exchange between Sensor devices is increasing at a rapid pace. In this paper, we have proposed an Energy Efficient Lightweight Encryption (EELWE) algorithm for providing the confidentiality of data at the sensor level, particularly suitable for resource-constrained environments. Results obtained have proved that an EELWE consumes less energy relative to present lightweight ciphers and it supports multiple block sizes of 32-bit, 48-bit, and 64-bit.
Submitted 8 March, 2021;
originally announced March 2021.
-
Design and Development of Robots End Effector Test Rig
Authors:
Josephine Selvarani Ruth D,
Saniya Zeba,
Vibha M R,
Rokesh Laishram,
Gauthama Anand
Abstract:
A Test Rig for end-effectors of a robot is designed such that it achieves a prismatic motion in x-y-z axes for grasping an object. It is a structure, designed with a compact combination of sensors and actuators. Sensors are used for detecting presence, position and disturbance of target work piece or any object and actuators with motor driving system meant for controlling and moving the mechanism of the system. Hence, it improves the ergonomics and accuracy of an operation with enhanced repeatability.
Submitted 4 January, 2021;
originally announced January 2021.
-
PeopleXploit -- A hybrid tool to collect public data
Authors:
Arjun Anand V,
Buvanasri A K,
Meenakshi R,
Karthika S,
Ashok Kumar Mohan
Abstract:
This paper introduces the concept of Open Source Intelligence (OSINT) as an important application in intelligent profiling of individuals. With a variety of tools available, significant data can be obtained on an individual by analyzing his/her internet presence, but all of this comes at the cost of low relevance. To increase the relevance score in profiling, PeopleXploit is being introduced. PeopleXploit is a hybrid tool which helps in collecting publicly available information that is reliable and relevant to the given input. This tool is used to track and trace the given target through their digital footprints, such as Name, Email, Phone Number, User IDs, etc.; the tool will scan and search other associated data from publicly available records on the internet and create a summary report on the target. PeopleXploit profiles a person using authorship analysis and finds the best matching guess. Also, the type of analysis performed (professional/matrimonial/criminal entity) varies with the requirement of the user.
Submitted 28 October, 2020;
originally announced October 2020.
-
Learning Complete 3D Morphable Face Models from Images and Videos
Authors:
Mallikarjun B R,
Ayush Tewari,
Hans-Peter Seidel,
Mohamed Elgharib,
Christian Theobalt
Abstract:
Most 3D face reconstruction methods rely on 3D morphable models, which disentangle the space of facial deformations into identity geometry, expressions and skin reflectance. These models are typically learned from a limited number of 3D scans and thus do not generalize well across different identities and expressions. We present the first approach to learn complete 3D models of face identity geometry, albedo and expression just from images and videos. The virtually endless collection of such data, in combination with our self-supervised learning-based approach allows for learning face models that generalize beyond the span of existing approaches. Our network design and loss functions ensure a disentangled parameterization of not only identity and albedo, but also, for the first time, an expression basis. Our method also allows for in-the-wild monocular reconstruction at test time. We show that our learned models better generalize and lead to higher quality image-based reconstructions than existing approaches.
Submitted 4 October, 2020;
originally announced October 2020.
-
PIE: Portrait Image Embedding for Semantic Control
Authors:
Ayush Tewari,
Mohamed Elgharib,
Mallikarjun B R.,
Florian Bernard,
Hans-Peter Seidel,
Patrick Pérez,
Michael Zollhöfer,
Christian Theobalt
Abstract:
Editing of portrait images is a very popular and important research topic with a large variety of applications. For ease of use, control should be provided via a semantically meaningful parameterization that is akin to computer animation controls. The vast majority of existing techniques do not provide such intuitive and fine-grained control, or only enable coarse editing of a single isolated control parameter. Very recently, high-quality semantically controlled editing has been demonstrated, however only on synthetically created StyleGAN images. We present the first approach for embedding real portrait images in the latent space of StyleGAN, which allows for intuitive editing of the head pose, facial expression, and scene illumination in the image. Semantic editing in parameter space is achieved based on StyleRig, a pretrained neural network that maps the control space of a 3D morphable face model to the latent space of the GAN. We design a novel hierarchical non-linear optimization problem to obtain the embedding. An identity preservation energy term allows spatially coherent edits while maintaining facial integrity. Our approach runs at interactive frame rates and thus allows the user to explore the space of possible edits. We evaluate our approach on a wide set of portrait photos, compare it to the current state of the art, and validate the effectiveness of its components in an ablation study.
Submitted 20 September, 2020;
originally announced September 2020.
-
Monocular Reconstruction of Neural Face Reflectance Fields
Authors:
Mallikarjun B R.,
Ayush Tewari,
Tae-Hyun Oh,
Tim Weyrich,
Bernd Bickel,
Hans-Peter Seidel,
Hanspeter Pfister,
Wojciech Matusik,
Mohamed Elgharib,
Christian Theobalt
Abstract:
The reflectance field of a face describes the reflectance properties responsible for complex lighting effects including diffuse, specular, inter-reflection and self-shadowing. Most existing methods for estimating the face reflectance from a monocular image assume faces to be diffuse, with very few approaches adding a specular component. This still leaves out important perceptual aspects of reflectance, as higher-order global illumination effects and self-shadowing are not modeled. We present a new neural representation for face reflectance where we can estimate all components of the reflectance responsible for the final appearance from a single monocular image. Instead of modeling each component of the reflectance separately using parametric models, our neural representation allows us to generate a basis set of faces in a geometric deformation-invariant space, parameterized by the input light direction, viewpoint and face geometry. We learn to reconstruct this reflectance field of a face just from a monocular image, which can be used to render the face from any viewpoint in any light condition. Our method is trained on a light-stage training dataset, which captures 300 people illuminated with 150 light conditions from 8 viewpoints. We show that our method outperforms existing monocular reflectance reconstruction methods in terms of photorealism, owing to better capture of physical primitives such as sub-surface scattering, specularities, self-shadows and other higher-order effects.
Submitted 24 August, 2020;
originally announced August 2020.
-
End-to-End Code Switching Language Models for Automatic Speech Recognition
Authors:
Ahan M. R.,
Shreyas Sunil Kulkarni
Abstract:
In this paper, we work on code-switched text, one of the most common occurrences in bilingual communities across the world. Due to the discrepancies in the extraction of code-switched text from an Automated Speech Recognition (ASR) module, and thereby in extracting the monolingual text from the code-switched text, we propose an approach for extracting monolingual text using Deep Bi-directional Language Models (LM) such as BERT and other Machine Translation models, and also explore different ways of extracting code-switched text from the ASR model. We also explain the robustness of the model by comparing the results of Perplexity and other metrics such as WER to the standard bi-lingual text output without any external information.
Submitted 15 June, 2020;
originally announced June 2020.
-
Joint User Grouping, Scheduling, and Precoding for Multicast Energy Efficiency in Multigroup Multicast Systems
Authors:
Ashok Bandi,
Bhavani Shankar Mysore R,
Symeon Chatzinotas,
Björn Ottersten
Abstract:
This paper studies the joint design of user grouping, scheduling (or admission control) and precoding to optimize energy efficiency (EE) for multigroup multicast scenarios in single-cell multiuser MISO downlink channels. Noticing that the existing definition of EE fails to account for group sizes, a new metric called multicast energy efficiency (MEE) is proposed. In this context, the joint design is considered for the maximization of MEE, EE, and scheduled users. Firstly, with the help of binary variables (associated with grouping and scheduling), the joint design problem is formulated as a mixed-Boolean fractional programming problem such that it facilitates the joint update of grouping, scheduling and precoding variables. Further, several novel optimization formulations are proposed to reveal the hidden difference-of-convex/concave structure in the objective and associated constraints. Thereafter, we propose a convex-concave procedure framework based iterative algorithm for each optimization criterion, where grouping, scheduling, and precoding variables are updated jointly in each iteration. Finally, we compare the performance of the three design criteria concerning three performance metrics, namely MEE, EE, and scheduled users, through Monte-Carlo simulations. These simulations establish the need for MEE and the improvement from the system optimization.
Submitted 14 May, 2020;
originally announced May 2020.
-
Team Mountaineers Space Robotic Challenge Phase-2 Qualification Round Preparation Report
Authors:
Cagri Kilic,
Christopher A. Tatsch,
Bernardo Martinez R. Jr,
Jared J. Beard,
Derek W. Ross,
Jason N. Gross
Abstract:
Team Mountaineers launched efforts on the NASA Space Robotics Challenge Phase-2 (SRC2). The challenge will be held on the lunar terrain with virtual robotic platforms to establish an in-situ resource utilization process. In this report, we provide an overview of a simulation environment, a virtual mobile robot, and a software architecture that was created by Team Mountaineers in order to prepare for the competition's qualification round before the competition environment was released.
Submitted 22 March, 2020;
originally announced March 2020.
-
On Computing the Hamiltonian Index of Graphs
Authors:
Geevarghese Philip,
Rani M. R.,
Subashini R
Abstract:
The $r$-th iterated line graph $L^{r}(G)$ of a graph $G$ is defined by: (i) $L^{0}(G) = G$ and (ii) $L^{r}(G) = L(L^{(r-1)}(G))$ for $r > 0$, where $L(G)$ denotes the line graph of $G$. The Hamiltonian Index $h(G)$ of $G$ is the smallest $r$ such that $L^{r}(G)$ has a Hamiltonian cycle. Checking if $h(G) = k$ is NP-hard for any fixed integer $k \geq 0$ even for subcubic graphs $G$. We study the parameterized complexity of this problem with the parameter treewidth, $tw(G)$, and show that we can find $h(G)$ in time $O^{*}((1 + 2^{(\omega + 3)})^{tw(G)})$ where $\omega$ is the matrix multiplication exponent and the $O^{*}$ notation hides polynomial factors in input size.
The NP-hard Eulerian Steiner Subgraph problem takes as input a graph $G$ and a specified subset $K$ of terminal vertices of $G$ and asks if $G$ has an Eulerian (that is, connected and with all vertices of even degree) subgraph $H$ containing all the terminals. A second result (and a key ingredient of our algorithm for finding $h(G)$) in this work is an algorithm which solves Eulerian Steiner Subgraph in $O^{*}((1 + 2^{(\omega + 3)})^{tw(G)})$ time.
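The definitions of $L^{r}(G)$ and $h(G)$ in this abstract can be illustrated with a small brute-force sketch. This is not the treewidth-based algorithm the abstract describes: it simply enumerates vertex orderings, so it is only usable on tiny graphs. The function names and the `max_r` cutoff are assumptions made for illustration.

```python
from itertools import combinations, permutations

def line_graph(edges):
    """Return (vertices, edges) of L(G): one vertex per edge of G,
    two of them adjacent when the underlying edges share an endpoint."""
    new_vertices = set(edges)
    new_edges = {frozenset((e, f))
                 for e, f in combinations(edges, 2) if e & f}
    return new_vertices, new_edges

def has_hamiltonian_cycle(vertices, edges):
    """Brute-force test: try every cyclic ordering of the vertices."""
    vs = list(vertices)
    if len(vs) < 3:
        return False
    first, rest = vs[0], vs[1:]
    for perm in permutations(rest):
        cycle = [first, *perm, first]
        if all(frozenset((a, b)) in edges for a, b in zip(cycle, cycle[1:])):
            return True
    return False

def hamiltonian_index(vertices, edges, max_r=5):
    """Smallest r <= max_r such that L^r(G) has a Hamiltonian cycle,
    or None if no such r is found (max_r is an illustrative cutoff)."""
    for r in range(max_r + 1):
        if has_hamiltonian_cycle(vertices, edges):
            return r
        vertices, edges = line_graph(edges)
    return None

def make_graph(pairs):
    """Build (vertices, edges) from a list of vertex pairs."""
    edges = {frozenset(p) for p in pairs}
    vertices = {v for e in edges for v in e}
    return vertices, edges
```

For example, a 4-cycle is already Hamiltonian, so `hamiltonian_index(*make_graph([(0, 1), (1, 2), (2, 3), (3, 0)]))` gives 0. The star $K_{1,3}$ is not Hamiltonian, but its line graph is a triangle, so its index is 1. A path never becomes Hamiltonian under iteration, so the sketch returns `None` for it.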
Submitted 4 December, 2019;
originally announced December 2019.
-
Adaptive Artificial Intelligent Q&A Platform
Authors:
M. R. Akram,
C. P. Singhabahu,
M. S. M. Saad,
P. Deleepa,
Anupiya Nugaliyadde,
Yashas Mallawarachchi
Abstract:
The paper presents an approach to build a question and answer system that is capable of processing the information in a large dataset and allows the user to gain knowledge from this dataset by asking questions in natural language form. Key content of this research covers four dimensions: Corpus Preprocessing, Question Preprocessing, Deep Neural Network for Answer Extraction, and Answer Generation. The system is capable of understanding the question and responds to the user's query in natural language form as well. The goal is to make the user feel as if they were interacting with a person rather than a machine.
Submitted 19 January, 2019;
originally announced February 2019.
-
RNNSecureNet: Recurrent neural networks for Cyber security use-cases
Authors:
Mohammed Harun Babu R,
Vinayakumar R,
Soman KP
Abstract:
The recurrent neural network (RNN) is an effective neural network for solving very complex supervised and unsupervised tasks. RNNs have brought significant improvements in fields such as natural language processing, speech processing, computer vision and other domains. This paper deals with RNN applications on different use cases such as Incident Detection, Fraud Detection, and Android Malware Classification. The best performing neural network architecture is chosen by conducting different chains of experiments for different network parameters and structures. The network is run for up to 1000 epochs with the learning rate set in the range of 0.01 to 0.5. The RNN performed very well when compared to classical machine learning algorithms. This is mainly possible because RNNs implicitly extract the underlying features and also identify the characteristics of the data. This helps to achieve better accuracy.
Submitted 5 January, 2019;
originally announced January 2019.
-
A short review on Applications of Deep learning for Cyber security
Authors:
Mohammed Harun Babu R,
Vinayakumar R,
Soman KP
Abstract:
Deep learning is an advanced model of traditional machine learning. This has the capability to extract optimal feature representation from raw input samples. This has been applied towards various use cases in cyber security such as intrusion detection, malware classification, android malware detection, spam and phishing detection and binary analysis. This paper outlines the survey of all the works related to deep learning based solutions for various cyber security use cases. Keywords: Deep learning, intrusion detection, malware detection, Android malware detection, spam & phishing detection, traffic analysis, binary analysis.
Submitted 29 January, 2019; v1 submitted 15 December, 2018;
originally announced December 2018.
-
Multi-layer Pruning Framework for Compressing Single Shot MultiBox Detector
Authors:
Pravendra Singh,
Manikandan R,
Neeraj Matiyali,
Vinay P. Namboodiri
Abstract:
We propose a framework for compressing the state-of-the-art Single Shot MultiBox Detector (SSD). The framework addresses compression in the following stages: Sparsity Induction, Filter Selection, and Filter Pruning. In the Sparsity Induction stage, the object detector model is sparsified via an improved global threshold. In the Filter Selection & Pruning stage, we select and remove filters using sparsity statistics of filter weights in two consecutive convolutional layers. This results in a model smaller than most existing compact architectures. We evaluate the performance of our framework on multiple datasets and compare against multiple methods. Experimental results show that our method achieves state-of-the-art compression of 6.7X and 4.9X on the PASCAL VOC dataset on models SSD300 and SSD512 respectively. We further show that the method produces a maximum compression of 26X with SSD512 on the German Traffic Sign Detection Benchmark (GTSDB). Additionally, we empirically show our method's adaptability for the classification-based architecture VGG16 on the datasets CIFAR and German Traffic Sign Recognition Benchmark (GTSRB), achieving compression rates of 125X and 200X with reductions in FLOPs of 90.50% and 96.6% respectively, with no loss of accuracy. In addition, our method does not require any special libraries or hardware support for the resulting compressed models.
Submitted 20 November, 2018;
originally announced November 2018.
-
Building a Word Segmenter for Sanskrit Overnight
Authors:
Vikas Reddy,
Amrith Krishna,
Vishnu Dutt Sharma,
Prateek Gupta,
Vineeth M R,
Pawan Goyal
Abstract:
There is an abundance of digitised texts available in Sanskrit. However, the word segmentation task in such texts is challenging due to the issue of 'Sandhi'. In Sandhi, words in a sentence often fuse together to form a single chunk of text, where the word delimiter vanishes and sounds at the word boundaries undergo transformations, which is also reflected in the written text. Here, we propose an approach that uses a deep sequence to sequence (seq2seq) model that takes only the sandhied string as the input and predicts the unsandhied string. The state of the art models are linguistically involved and have external dependencies for the lexical and morphological analysis of the input. Our model can be trained "overnight" and be used for production. In spite of the knowledge-lean approach, our system performs better than the current state of the art, with a percentage increase of 16.79%.
Submitted 16 February, 2018;
originally announced February 2018.
-
Pengaruh Perangkat Server Terhadap Kualitas Pengontrolan Jarak Jauh Melalui Internet
Authors:
Gunawan,
Imam Muslim R
Abstract:
The internet greatly assists people in improving their quality of life. Almost all areas of human life can be accessed using the internet, which provides people with all sorts of information that they need. Along with the development of internet network infrastructure, remote control has begun to move to the internet. This study uses a notebook and a Raspberry Pi as servers to assess the control quality obtained with each server device. We investigate the possibility of improving the quality of web-based remote control by using a Raspberry Pi as the web server, and how much improvement in the quality of web-based remote control is obtained.
Submitted 1 October, 2017;
originally announced October 2017.
-
ASTROMLSKIT: A New Statistical Machine Learning Toolkit: A Platform for Data Analytics in Astronomy
Authors:
Snehanshu Saha,
Surbhi Agrawal,
Manikandan. R,
Kakoli Bora,
Swati Routh,
Anand Narasimhamurthy
Abstract:
Astroinformatics is a new impact area in the world of astronomy, occasionally called the final frontier, where astrophysicists, statisticians and computer scientists work together to tackle various data-intensive astronomical problems. Exponential growth in the data volume and increased complexity of the data add difficult questions to the existing challenges. Classical problems in Astronomy are compounded by the accumulation of astronomical volumes of complex data, rendering the task of classification and interpretation incredibly laborious. The presence of noise in the data makes analysis and interpretation even more arduous. Machine learning algorithms and data analytic techniques provide the right platform for the challenges posed by these problems. A diverse range of open problems, such as star-galaxy separation, detection and classification of exoplanets, and classification of supernovae, is discussed. The focus of the paper is the applicability and efficacy of various machine learning algorithms like K Nearest Neighbor (KNN), random forest (RF), decision tree (DT), Support Vector Machine (SVM), Naïve Bayes and Linear Discriminant Analysis (LDA) in analysis and inference of the decision theoretic problems in Astronomy. The machine learning algorithms, integrated into ASTROMLSKIT, a toolkit developed in the course of the work, have been used to analyze HabCat data and supernovae data. Accuracy has been found to be appreciably good.
Submitted 29 April, 2015;
originally announced April 2015.
-
Interference Mitigating Satellite Broadcast Receiver using Reduced Complexity List-Based Detection in Correlated Noise
Authors:
Zohair Abu-Shaban,
Hani Mehrpouyan,
Bhavani Shankar M. R.,
Bjorn Ottersten
Abstract:
The recent commercial trends towards using smaller dish antennas for satellite receivers, and the growing density of broadcasting satellites, necessitate the application of robust adjacent satellite interference (ASI) cancellation schemes. This orbital density growth along with the wider beamwidth of a smaller dish have imposed an overloaded scenario at the satellite receiver, where the number of transmitting satellites exceeds the number of receiving elements at the dish antenna. To ensure successful operation in this practical scenario, we propose a satellite receiver that enhances signal detection from the desired satellite by mitigating the interference from neighboring satellites. Towards this objective, we propose a reduced complexity list-based group-wise search detection (RC-LGSD) receiver under the assumption of spatially correlated additive noise. To further enhance detection performance, the proposed satellite receiver utilizes a newly designed whitening filter to remove the spatial correlation amongst the noise parameters, while also applying a preprocessor that maximizes the signal-to-interference-plus-noise ratio (SINR). Extensive simulations under practical scenarios show that the proposed receiver enhances the performance of satellite broadcast systems in the presence of ASI compared to existing methods.
Submitted 25 April, 2014;
originally announced April 2014.
-
Enhanced List-Based Group-Wise Overloaded Receiver with Application to Satellite Reception
Authors:
Zohair Abu-Shaban,
Bhavani Shankar M. R,
Hani Mehrpouyan,
Bjorn Ottersten
Abstract:
The market trends towards the use of smaller dish antennas for TV satellite receivers, as well as the growing density of broadcasting satellites in orbit require the application of robust adjacent satellite interference (ASI) cancellation algorithms at the receivers. The wider beamwidth of a small size dish and the growing number of satellites in orbit impose an overloaded scenario, i.e., a scenario where the number of transmitting satellites exceeds the number of receiving antennas. For such a scenario, we present a two stage receiver to enhance signal detection from the satellite of interest, i.e., the satellite that the dish is pointing to, while reducing interference from neighboring satellites. Towards this objective, we propose an enhanced List-based Group-wise Search Detection (LGSD) receiver architecture that takes into account the spatially correlated additive noise and uses the signal-to-interference-plus noise ratio (SINR) maximization criterion to improve detection performance. Simulations show that the proposed receiver structure enhances the performance of satellite systems in the presence of ASI when compared to existing methods.
Submitted 17 April, 2014;
originally announced April 2014.
-
A Survey on Mobile Data Gathering in Wireless Sensor Networks - Bounded Relay
Authors:
Ms. Rubia. R,
Mr. SivanArulSelvan
Abstract:
Most wireless sensor networks consist of static sensors, which can be deployed in a wide environment for monitoring applications. While transmitting data from the source to a static sink, the energy consumption of the sensor node is high, which reduces the lifetime of the network. Some WSN architectures based on Mobile Elements have been proposed, and there is a large number of approaches to resolve the above problem. Two approaches, namely the Single Hop Data Gathering Problem (SHDGP) and mobile Data Gathering, are found to increase the lifetime of the network. The Single Hop Data Gathering Problem is used to achieve uniform energy consumption. The mobile Data Gathering algorithm is used to find the minimal set of points in the sensor network that serve as data gathering points for the mobile network. Even after decades of research, some problems, such as non-uniform energy consumption and increased latency, remain unresolved.
Submitted 6 February, 2014;
originally announced February 2014.