-
Knowledge Distillation for Enhancing Walmart E-commerce Search Relevance Using Large Language Models
Authors:
Hongwei Shang,
Nguyen Vo,
Nitin Yadav,
Tian Zhang,
Ajit Puthenputhussery,
Xunfan Cai,
Shuyi Chen,
Prijith Chandran,
Changsung Kang
Abstract:
Ensuring the products displayed in e-commerce search results are relevant to users queries is crucial for improving the user experience. With their advanced semantic understanding, deep learning models have been widely used for relevance matching in search tasks. While large language models (LLMs) offer superior ranking capabilities, it is challenging to deploy LLMs in real-time systems due to the…
▽ More
Ensuring the products displayed in e-commerce search results are relevant to users queries is crucial for improving the user experience. With their advanced semantic understanding, deep learning models have been widely used for relevance matching in search tasks. While large language models (LLMs) offer superior ranking capabilities, it is challenging to deploy LLMs in real-time systems due to the high-latency requirements. To leverage the ranking power of LLMs while meeting the low-latency demands of production systems, we propose a novel framework that distills a high performing LLM into a more efficient, low-latency student model. To help the student model learn more effectively from the teacher model, we first train the teacher LLM as a classification model with soft targets. Then, we train the student model to capture the relevance margin between pairs of products for a given query using mean squared error loss. Instead of using the same training data as the teacher model, we significantly expand the student model dataset by generating unlabeled data and labeling it with the teacher model predictions. Experimental results show that the student model performance continues to improve as the size of the augmented training data increases. In fact, with enough augmented data, the student model can outperform the teacher model. The student model has been successfully deployed in production at Walmart.com with significantly positive metrics.
△ Less
Submitted 11 May, 2025;
originally announced May 2025.
-
Spline-based Transformers
Authors:
Prashanth Chandran,
Agon Serifi,
Markus Gross,
Moritz Bächer
Abstract:
We introduce Spline-based Transformers, a novel class of Transformer models that eliminate the need for positional encoding. Inspired by workflows using splines in computer animation, our Spline-based Transformers embed an input sequence of elements as a smooth trajectory in latent space. Overcoming drawbacks of positional encoding such as sequence length extrapolation, Spline-based Transformers a…
▽ More
We introduce Spline-based Transformers, a novel class of Transformer models that eliminate the need for positional encoding. Inspired by workflows using splines in computer animation, our Spline-based Transformers embed an input sequence of elements as a smooth trajectory in latent space. Overcoming drawbacks of positional encoding such as sequence length extrapolation, Spline-based Transformers also provide a novel way for users to interact with transformer latent spaces by directly manipulating the latent control points to create new latent trajectories and sequences. We demonstrate the superior performance of our approach in comparison to conventional positional encoding on a variety of datasets, ranging from synthetic 2D to large-scale real-world datasets of images, 3D shapes, and animations.
△ Less
Submitted 3 April, 2025;
originally announced April 2025.
-
Joint Learning of Depth and Appearance for Portrait Image Animation
Authors:
Xinya Ji,
Gaspard Zoss,
Prashanth Chandran,
Lingchen Yang,
Xun Cao,
Barbara Solenthaler,
Derek Bradley
Abstract:
2D portrait animation has experienced significant advancements in recent years. Much research has utilized the prior knowledge embedded in large generative diffusion models to enhance high-quality image manipulation. However, most methods only focus on generating RGB images as output, and the co-generation of consistent visual plus 3D output remains largely under-explored. In our work, we propose…
▽ More
2D portrait animation has experienced significant advancements in recent years. Much research has utilized the prior knowledge embedded in large generative diffusion models to enhance high-quality image manipulation. However, most methods only focus on generating RGB images as output, and the co-generation of consistent visual plus 3D output remains largely under-explored. In our work, we propose to jointly learn the visual appearance and depth simultaneously in a diffusion-based portrait image generator. Our method embraces the end-to-end diffusion paradigm and introduces a new architecture suitable for learning this conditional joint distribution, consisting of a reference network and a channel-expanded diffusion backbone. Once trained, our framework can be efficiently adapted to various downstream applications, such as facial depth-to-image and image-to-depth generation, portrait relighting, and audio-driven talking head animation with consistent 3D output.
△ Less
Submitted 15 January, 2025;
originally announced January 2025.
-
Monocular Facial Appearance Capture in the Wild
Authors:
Yingyan Xu,
Kate Gadola,
Prashanth Chandran,
Sebastian Weiss,
Markus Gross,
Gaspard Zoss,
Derek Bradley
Abstract:
We present a new method for reconstructing the appearance properties of human faces from a lightweight capture procedure in an unconstrained environment. Our method recovers the surface geometry, diffuse albedo, specular intensity and specular roughness from a monocular video containing a simple head rotation in-the-wild. Notably, we make no simplifying assumptions on the environment lighting, and…
▽ More
We present a new method for reconstructing the appearance properties of human faces from a lightweight capture procedure in an unconstrained environment. Our method recovers the surface geometry, diffuse albedo, specular intensity and specular roughness from a monocular video containing a simple head rotation in-the-wild. Notably, we make no simplifying assumptions on the environment lighting, and we explicitly take visibility and occlusions into account. As a result, our method can produce facial appearance maps that approach the fidelity of studio-based multi-view captures, but with a far easier and cheaper procedure.
△ Less
Submitted 17 December, 2024;
originally announced December 2024.
-
Enhancing Relevance of Embedding-based Retrieval at Walmart
Authors:
Juexin Lin,
Sachin Yadav,
Feng Liu,
Nicholas Rossi,
Praveen R. Suram,
Satya Chembolu,
Prijith Chandran,
Hrushikesh Mohapatra,
Tony Lee,
Alessandro Magnani,
Ciya Liao
Abstract:
Embedding-based neural retrieval (EBR) is an effective search retrieval method in product search for tackling the vocabulary gap between customer search queries and products. The initial launch of our EBR system at Walmart yielded significant gains in relevance and add-to-cart rates [1]. However, despite EBR generally retrieving more relevant products for reranking, we have observed numerous insta…
▽ More
Embedding-based neural retrieval (EBR) is an effective search retrieval method in product search for tackling the vocabulary gap between customer search queries and products. The initial launch of our EBR system at Walmart yielded significant gains in relevance and add-to-cart rates [1]. However, despite EBR generally retrieving more relevant products for reranking, we have observed numerous instances of relevance degradation. Enhancing retrieval performance is crucial, as it directly influences product reranking and affects the customer shopping experience. Factors contributing to these degradations include false positives/negatives in the training data and the inability to handle query misspellings. To address these issues, we present several approaches to further strengthen the capabilities of our EBR model in terms of retrieval relevance. We introduce a Relevance Reward Model (RRM) based on human relevance feedback. We utilize RRM to remove noise from the training data and distill it into our EBR model through a multi-objective loss. In addition, we present the techniques to increase the performance of our EBR model, such as typo-aware training, and semi-positive generation. The effectiveness of our EBR is demonstrated through offline relevance evaluation, online AB tests, and successful deployments to live production.
[1] Alessandro Magnani, Feng Liu, Suthee Chaidaroon, Sachin Yadav, Praveen Reddy Suram, Ajit Puthenputhussery, Sijie Chen, Min Xie, Anirudh Kashi, Tony Lee, et al. 2022. Semantic retrieval at walmart. In Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining. 3495-3503.
△ Less
Submitted 14 August, 2024; v1 submitted 9 August, 2024;
originally announced August 2024.
-
Multimodal Conditional 3D Face Geometry Generation
Authors:
Christopher Otto,
Prashanth Chandran,
Sebastian Weiss,
Markus Gross,
Gaspard Zoss,
Derek Bradley
Abstract:
We present a new method for multimodal conditional 3D face geometry generation that allows user-friendly control over the output identity and expression via a number of different conditioning signals. Within a single model, we demonstrate 3D faces generated from artistic sketches, 2D face landmarks, Canny edges, FLAME face model parameters, portrait photos, or text prompts. Our approach is based o…
▽ More
We present a new method for multimodal conditional 3D face geometry generation that allows user-friendly control over the output identity and expression via a number of different conditioning signals. Within a single model, we demonstrate 3D faces generated from artistic sketches, 2D face landmarks, Canny edges, FLAME face model parameters, portrait photos, or text prompts. Our approach is based on a diffusion process that generates 3D geometry in a 2D parameterized UV domain. Geometry generation passes each conditioning signal through a set of cross-attention layers (IP-Adapter), one set for each user-defined conditioning signal. The result is an easy-to-use 3D face generation tool that produces high resolution geometry with fine-grain user control.
△ Less
Submitted 1 July, 2024;
originally announced July 2024.
-
Large Language Models for Relevance Judgment in Product Search
Authors:
Navid Mehrdad,
Hrushikesh Mohapatra,
Mossaab Bagdouri,
Prijith Chandran,
Alessandro Magnani,
Xunfan Cai,
Ajit Puthenputhussery,
Sachin Yadav,
Tony Lee,
ChengXiang Zhai,
Ciya Liao
Abstract:
High relevance of retrieved and re-ranked items to the search query is the cornerstone of successful product search, yet measuring relevance of items to queries is one of the most challenging tasks in product information retrieval, and quality of product search is highly influenced by the precision and scale of available relevance-labelled data. In this paper, we present an array of techniques for…
▽ More
High relevance of retrieved and re-ranked items to the search query is the cornerstone of successful product search, yet measuring relevance of items to queries is one of the most challenging tasks in product information retrieval, and quality of product search is highly influenced by the precision and scale of available relevance-labelled data. In this paper, we present an array of techniques for leveraging Large Language Models (LLMs) for automating the relevance judgment of query-item pairs (QIPs) at scale. Using a unique dataset of multi-million QIPs, annotated by human evaluators, we test and optimize hyper parameters for finetuning billion-parameter LLMs with and without Low Rank Adaption (LoRA), as well as various modes of item attribute concatenation and prompting in LLM finetuning, and consider trade offs in item attribute inclusion for quality of relevance predictions. We demonstrate considerable improvement over baselines of prior generations of LLMs, as well as off-the-shelf models, towards relevance annotations on par with the human relevance evaluators. Our findings have immediate implications for the growing field of relevance judgment automation in product search.
△ Less
Submitted 16 July, 2024; v1 submitted 31 May, 2024;
originally announced June 2024.
-
Infinite 3D Landmarks: Improving Continuous 2D Facial Landmark Detection
Authors:
Prashanth Chandran,
Gaspard Zoss,
Paulo Gotardo,
Derek Bradley
Abstract:
In this paper, we examine 3 important issues in the practical use of state-of-the-art facial landmark detectors and show how a combination of specific architectural modifications can directly improve their accuracy and temporal stability. First, many facial landmark detectors require face normalization as a preprocessing step, which is accomplished by a separately-trained neural network that crops…
▽ More
In this paper, we examine 3 important issues in the practical use of state-of-the-art facial landmark detectors and show how a combination of specific architectural modifications can directly improve their accuracy and temporal stability. First, many facial landmark detectors require face normalization as a preprocessing step, which is accomplished by a separately-trained neural network that crops and resizes the face in the input image. There is no guarantee that this pre-trained network performs the optimal face normalization for landmark detection. We instead analyze the use of a spatial transformer network that is trained alongside the landmark detector in an unsupervised manner, and jointly learn optimal face normalization and landmark detection. Second, we show that modifying the output head of the landmark predictor to infer landmarks in a canonical 3D space can further improve accuracy. To convert the predicted 3D landmarks into screen-space, we additionally predict the camera intrinsics and head pose from the input image. As a side benefit, this allows to predict the 3D face shape from a given image only using 2D landmarks as supervision, which is useful in determining landmark visibility among other things. Finally, when training a landmark detector on multiple datasets at the same time, annotation inconsistencies across datasets forces the network to produce a suboptimal average. We propose to add a semantic correction network to address this issue. This additional lightweight neural network is trained alongside the landmark detector, without requiring any additional supervision. While the insights of this paper can be applied to most common landmark detectors, we specifically target a recently-proposed continuous 2D landmark detector to demonstrate how each of our additions leads to meaningful improvements over the state-of-the-art on standard benchmarks.
△ Less
Submitted 30 May, 2024;
originally announced May 2024.
-
Learning a Generalized Physical Face Model From Data
Authors:
Lingchen Yang,
Gaspard Zoss,
Prashanth Chandran,
Markus Gross,
Barbara Solenthaler,
Eftychios Sifakis,
Derek Bradley
Abstract:
Physically-based simulation is a powerful approach for 3D facial animation as the resulting deformations are governed by physical constraints, allowing to easily resolve self-collisions, respond to external forces and perform realistic anatomy edits. Today's methods are data-driven, where the actuations for finite elements are inferred from captured skin geometry. Unfortunately, these approaches h…
▽ More
Physically-based simulation is a powerful approach for 3D facial animation as the resulting deformations are governed by physical constraints, allowing to easily resolve self-collisions, respond to external forces and perform realistic anatomy edits. Today's methods are data-driven, where the actuations for finite elements are inferred from captured skin geometry. Unfortunately, these approaches have not been widely adopted due to the complexity of initializing the material space and learning the deformation model for each character separately, which often requires a skilled artist followed by lengthy network training. In this work, we aim to make physics-based facial animation more accessible by proposing a generalized physical face model that we learn from a large 3D face dataset. Once trained, our model can be quickly fit to any unseen identity and produce a ready-to-animate physical face model automatically. Fitting is as easy as providing a single 3D face scan, or even a single face image. After fitting, we offer intuitive animation controls, as well as the ability to retarget animations across characters. All the while, the resulting animations allow for physical effects like collision avoidance, gravity, paralysis, bone reshaping and more.
△ Less
Submitted 3 September, 2024; v1 submitted 29 February, 2024;
originally announced February 2024.
-
An Implicit Physical Face Model Driven by Expression and Style
Authors:
Lingchen Yang,
Gaspard Zoss,
Prashanth Chandran,
Paulo Gotardo,
Markus Gross,
Barbara Solenthaler,
Eftychios Sifakis,
Derek Bradley
Abstract:
3D facial animation is often produced by manipulating facial deformation models (or rigs), that are traditionally parameterized by expression controls. A key component that is usually overlooked is expression 'style', as in, how a particular expression is performed. Although it is common to define a semantic basis of expressions that characters can perform, most characters perform each expression…
▽ More
3D facial animation is often produced by manipulating facial deformation models (or rigs), that are traditionally parameterized by expression controls. A key component that is usually overlooked is expression 'style', as in, how a particular expression is performed. Although it is common to define a semantic basis of expressions that characters can perform, most characters perform each expression in their own style. To date, style is usually entangled with the expression, and it is not possible to transfer the style of one character to another when considering facial animation. We present a new face model, based on a data-driven implicit neural physics model, that can be driven by both expression and style separately. At the core, we present a framework for learning implicit physics-based actuations for multiple subjects simultaneously, trained on a few arbitrary performance capture sequences from a small set of identities. Once trained, our method allows generalized physics-based facial animation for any of the trained identities, extending to unseen performances. Furthermore, it grants control over the animation style, enabling style transfer from one character to another or blending styles of different characters. Lastly, as a physics-based model, it is capable of synthesizing physical effects, such as collision handling, setting our method apart from conventional approaches.
△ Less
Submitted 27 January, 2024;
originally announced January 2024.
-
Anatomically Constrained Implicit Face Models
Authors:
Prashanth Chandran,
Gaspard Zoss
Abstract:
Coordinate based implicit neural representations have gained rapid popularity in recent years as they have been successfully used in image, geometry and scene modeling tasks. In this work, we present a novel use case for such implicit representations in the context of learning anatomically constrained face models. Actor specific anatomically constrained face models are the state of the art in both…
▽ More
Coordinate based implicit neural representations have gained rapid popularity in recent years as they have been successfully used in image, geometry and scene modeling tasks. In this work, we present a novel use case for such implicit representations in the context of learning anatomically constrained face models. Actor specific anatomically constrained face models are the state of the art in both facial performance capture and performance retargeting. Despite their practical success, these anatomical models are slow to evaluate and often require extensive data capture to be built. We propose the anatomical implicit face model; an ensemble of implicit neural networks that jointly learn to model the facial anatomy and the skin surface with high-fidelity, and can readily be used as a drop in replacement to conventional blendshape models. Given an arbitrary set of skin surface meshes of an actor and only a neutral shape with estimated skull and jaw bones, our method can recover a dense anatomical substructure which constrains every point on the facial surface. We demonstrate the usefulness of our approach in several tasks ranging from shape fitting, shape editing, and performance retargeting.
△ Less
Submitted 12 December, 2023;
originally announced December 2023.
-
Artist-Friendly Relightable and Animatable Neural Heads
Authors:
Yingyan Xu,
Prashanth Chandran,
Sebastian Weiss,
Markus Gross,
Gaspard Zoss,
Derek Bradley
Abstract:
An increasingly common approach for creating photo-realistic digital avatars is through the use of volumetric neural fields. The original neural radiance field (NeRF) allowed for impressive novel view synthesis of static heads when trained on a set of multi-view images, and follow up methods showed that these neural representations can be extended to dynamic avatars. Recently, new variants also su…
▽ More
An increasingly common approach for creating photo-realistic digital avatars is through the use of volumetric neural fields. The original neural radiance field (NeRF) allowed for impressive novel view synthesis of static heads when trained on a set of multi-view images, and follow up methods showed that these neural representations can be extended to dynamic avatars. Recently, new variants also surpassed the usual drawback of baked-in illumination in neural representations, showing that static neural avatars can be relit in any environment. In this work we simultaneously tackle both the motion and illumination problem, proposing a new method for relightable and animatable neural heads. Our method builds on a proven dynamic avatar approach based on a mixture of volumetric primitives, combined with a recently-proposed lightweight hardware setup for relightable neural fields, and includes a novel architecture that allows relighting dynamic neural avatars performing unseen expressions in any environment, even with nearfield illumination and viewpoints.
△ Less
Submitted 6 December, 2023;
originally announced December 2023.
-
A Perceptual Shape Loss for Monocular 3D Face Reconstruction
Authors:
Christopher Otto,
Prashanth Chandran,
Gaspard Zoss,
Markus Gross,
Paulo Gotardo,
Derek Bradley
Abstract:
Monocular 3D face reconstruction is a wide-spread topic, and existing approaches tackle the problem either through fast neural network inference or offline iterative reconstruction of face geometry. In either case carefully-designed energy functions are minimized, commonly including loss terms like a photometric loss, a landmark reprojection loss, and others. In this work we propose a new loss fun…
▽ More
Monocular 3D face reconstruction is a wide-spread topic, and existing approaches tackle the problem either through fast neural network inference or offline iterative reconstruction of face geometry. In either case carefully-designed energy functions are minimized, commonly including loss terms like a photometric loss, a landmark reprojection loss, and others. In this work we propose a new loss function for monocular face capture, inspired by how humans would perceive the quality of a 3D face reconstruction given a particular image. It is widely known that shading provides a strong indicator for 3D shape in the human visual system. As such, our new 'perceptual' shape loss aims to judge the quality of a 3D face estimate using only shading cues. Our loss is implemented as a discriminator-style neural network that takes an input face image and a shaded render of the geometry estimate, and then predicts a score that perceptually evaluates how well the shaded render matches the given image. This 'critic' network operates on the RGB image and geometry render alone, without requiring an estimate of the albedo or illumination in the scene. Furthermore, our loss operates entirely in image space and is thus agnostic to mesh topology. We show how our new perceptual shape loss can be combined with traditional energy terms for monocular 3D face optimization and deep neural network regression, improving upon current state-of-the-art results.
△ Less
Submitted 30 October, 2023;
originally announced October 2023.
-
On the Simulation of Hypervisor Instructions for Accurate Timing Simulation of Virtualized Systems
Authors:
Swapneel C. Mhatre,
Priya Chandran
Abstract:
Architectural simulators help in better understanding the behaviour of existing architectures and the design of new architectures. Virtualization has regained importance and this has put a pressing demand for the simulation of virtualized systems. Existing full-system simulators for virtualized systems simulate the application program instructions and the operating system instructions but abstract…
▽ More
Architectural simulators help in better understanding the behaviour of existing architectures and the design of new architectures. Virtualization has regained importance and this has put a pressing demand for the simulation of virtualized systems. Existing full-system simulators for virtualized systems simulate the application program instructions and the operating system instructions but abstract the hypercalls or traps to the hypervisor. This leads to inaccuracy in the simulation. This paper proposes an approach to simulate hypervisor instructions in addition to operating system instructions for accurate timing simulation of virtualized systems. The proposed approach is demonstrated by simulating RISC-V binary instructions. The simulator is an execution-driven, functional-first, hardware-based simulator coded in Verilog. The paper concludes that the proposed approach leads to accurate timing simulation of virtualized systems.
△ Less
Submitted 1 June, 2022;
originally announced June 2022.
-
Communication Optimization in Large Scale Federated Learning using Autoencoder Compressed Weight Updates
Authors:
Srikanth Chandar,
Pravin Chandran,
Raghavendra Bhat,
Avinash Chakravarthi
Abstract:
Federated Learning (FL) solves many of this decade's concerns regarding data privacy and computation challenges. FL ensures no data leaves its source as the model is trained at where the data resides. However, FL comes with its own set of challenges. The communication of model weight updates in this distributed environment comes with significant network bandwidth costs. In this context, we propose…
▽ More
Federated Learning (FL) solves many of this decade's concerns regarding data privacy and computation challenges. FL ensures no data leaves its source as the model is trained at where the data resides. However, FL comes with its own set of challenges. The communication of model weight updates in this distributed environment comes with significant network bandwidth costs. In this context, we propose a mechanism of compressing the weight updates using Autoencoders (AE), which learn the data features of the weight updates and subsequently perform compression. The encoder is set up on each of the nodes where the training is performed while the decoder is set up on the node where the weights are aggregated. This setup achieves compression through the encoder and recreates the weights at the end of every communication round using the decoder. This paper shows that the dynamic and orthogonal AE based weight compression technique could serve as an advantageous alternative (or an add-on) in a large scale FL, as it not only achieves compression ratios ranging from 500x to 1720x and beyond, but can also be modified based on the accuracy requirements, computational capacity, and other requirements of the given FL setup.
△ Less
Submitted 12 August, 2021;
originally announced August 2021.
-
Hope Speech detection in under-resourced Kannada language
Authors:
Adeep Hande,
Ruba Priyadharshini,
Anbukkarasi Sampath,
Kingston Pal Thamburaj,
Prabakaran Chandran,
Bharathi Raja Chakravarthi
Abstract:
Numerous methods have been developed to monitor the spread of negativity in modern years by eliminating vulgar, offensive, and fierce comments from social media platforms. However, there are relatively lesser amounts of study that converges on embracing positivity, reinforcing supportive and reassuring content in online forums. Consequently, we propose creating an English-Kannada Hope speech datas…
▽ More
Numerous methods have been developed to monitor the spread of negativity in modern years by eliminating vulgar, offensive, and fierce comments from social media platforms. However, there are relatively lesser amounts of study that converges on embracing positivity, reinforcing supportive and reassuring content in online forums. Consequently, we propose creating an English-Kannada Hope speech dataset, KanHope and comparing several experiments to benchmark the dataset. The dataset consists of 6,176 user-generated comments in code mixed Kannada scraped from YouTube and manually annotated as bearing hope speech or Not-hope speech. In addition, we introduce DC-BERT4HOPE, a dual-channel model that uses the English translation of KanHope for additional training to promote hope speech detection. The approach achieves a weighted F1-score of 0.756, bettering other models. Henceforth, KanHope aims to instigate research in Kannada while broadly promoting researchers to take a pragmatic approach towards online content that encourages, positive, and supportive.
△ Less
Submitted 5 December, 2021; v1 submitted 10 August, 2021;
originally announced August 2021.
-
Weight Divergence Driven Divide-and-Conquer Approach for Optimal Federated Learning from non-IID Data
Authors:
Pravin Chandran,
Raghavendra Bhat,
Avinash Chakravarthi,
Srikanth Chandar
Abstract:
Federated Learning allows training of data stored in distributed devices without the need for centralizing training data, thereby maintaining data privacy. Addressing the ability to handle data heterogeneity (non-identical and independent distribution or non-IID) is a key enabler for the wider deployment of Federated Learning. In this paper, we propose a novel Divide-and-Conquer training methodolo…
▽ More
Federated Learning allows training of data stored in distributed devices without the need for centralizing training data, thereby maintaining data privacy. Addressing the ability to handle data heterogeneity (non-identical and independent distribution or non-IID) is a key enabler for the wider deployment of Federated Learning. In this paper, we propose a novel Divide-and-Conquer training methodology that enables the use of the popular FedAvg aggregation algorithm by overcoming the acknowledged FedAvg limitations in non-IID environments. We propose a novel use of Cosine-distance based Weight Divergence metric to determine the exact point where a Deep Learning network can be divided into class agnostic initial layers and class-specific deep layers for performing a Divide and Conquer training. We show that the methodology achieves trained model accuracy at par (and in certain cases exceeding) with numbers achieved by state-of-the-art Aggregation algorithms like FedProx, FedMA, etc. Also, we show that this methodology leads to compute and bandwidth optimizations under certain documented conditions.
△ Less
Submitted 29 June, 2021; v1 submitted 28 June, 2021;
originally announced June 2021.
-
NTP : A Neural Network Topology Profiler
Authors:
Raghavendra Bhat,
Pravin Chandran,
Juby Jose,
Viswanath Dibbur,
Prakash Sirra Ajith
Abstract:
Performance of end-to-end neural networks on a given hardware platform is a function of its compute and memory signature, which in-turn, is governed by a wide range of parameters such as topology size, primitives used, framework used, batching strategy, latency requirements, precision etc. Current benchmarking tools suffer from limitations such as a) being either too granular like DeepBench [1] (o…
▽ More
Performance of end-to-end neural networks on a given hardware platform is a function of its compute and memory signature, which in-turn, is governed by a wide range of parameters such as topology size, primitives used, framework used, batching strategy, latency requirements, precision etc. Current benchmarking tools suffer from limitations such as a) being either too granular like DeepBench [1] (or) b) mandate a working implementation that is either framework specific or hardware-architecture specific or both (or) c) provide only high level benchmark metrics. In this paper, we present NTP (Neural Net Topology Profiler), a sophisticated benchmarking framework, to effectively identify memory and compute signature of an end-to-end topology on multiple hardware architectures, without the need for an actual implementation. NTP is tightly integrated with hardware specific benchmarking tools to enable exhaustive data collection and analysis. Using NTP, a deep learning researcher can quickly establish baselines needed to understand performance of an end-to-end neural network topology and make high level architectural decisions. Further, integration of NTP with frameworks like Tensorflow, Pytorch, Intel OpenVINO etc. allows for performance comparison along several vectors like a) Comparison of different frameworks on a given hardware b) Comparison of different hardware using a given framework c) Comparison across different heterogeneous hardware configurations for given framework etc. These capabilities empower a researcher to effortlessly make architectural decisions needed for achieving optimized performance on any hardware platform. The paper documents the architectural approach of NTP and demonstrates the capabilities of the tool by benchmarking Mozilla DeepSpeech, a popular Speech Recognition topology.
△ Less
Submitted 24 May, 2019; v1 submitted 22 May, 2019;
originally announced May 2019.
-
Inferring disease causing genes and their pathways: A mathematical perspective
Authors:
Jeethu V. Devasia,
Priya Chandran
Abstract:
A system level view of cellular processes for human and several organisms can be cap- tured by analyzing molecular interaction networks. A molecular interaction network formed of differentially expressed genes and their interactions helps to understand key players behind disease development. So, if the functions of these genes are blocked by altering their interactions, it would have a great impac…
▽ More
A system level view of cellular processes for human and several organisms can be cap- tured by analyzing molecular interaction networks. A molecular interaction network formed of differentially expressed genes and their interactions helps to understand key players behind disease development. So, if the functions of these genes are blocked by altering their interactions, it would have a great impact in controlling the disease. Due to this promising consequence, the problem of inferring disease causing genes and their pathways has attained a crucial position in computational biology research. However, considering the huge size of interaction networks, executing computations can be costly. Review of literatures shows that the methods proposed for finding the set of disease causing genes could be assessed in terms of their accuracy which a perfect algorithm would find. Along with accuracy, the time complexity of the method is also important, as high time complexities would limit the number of pathways that could be found within a pragmatic time interval.
△ Less
Submitted 30 October, 2016;
originally announced November 2016.
-
Dynamic Partitioning of Physical Memory Among Virtual Machines, ASMI:Architectural Support for Memory Isolation
Authors:
Jithin R,
Priya Chandran
Abstract:
Cloud computing relies on secure and efficient virtualization. Software level security solutions compromise the performance of virtual machines (VMs), as a large amount of computational power would be utilized for running the security modules. Moreover, software solutions are only as secure as the level that they work on. For example a security module on a hypervisor cannot provide security in the…
▽ More
Cloud computing relies on secure and efficient virtualization. Software level security solutions compromise the performance of virtual machines (VMs), as a large amount of computational power would be utilized for running the security modules. Moreover, software solutions are only as secure as the level that they work on. For example a security module on a hypervisor cannot provide security in the presence of an infected hypervisor. It is a challenge for virtualization technology architects to enhance the security of VMs without degrading their performance. Currently available server machines are not fully equipped to support a secure VM environment without compromising on performance. A few hardware modifications have been introduced by manufactures like Intel and AMD to provide a secure VM environment with low performance degradation. In this paper we propose a novel memory architecture model named \textit{ Architectural Support for Memory Isolation(ASMI)}, that can achieve a true isolated physical memory region to each VM without degrading performance. Along with true memory isolation, ASMI is designed to provide lower memory access times, better utilization of available memory, support for DMA isolation and support for platform independence for users of VMs.
△ Less
Submitted 10 March, 2015;
originally announced March 2015.