Showing 1–6 of 6 results for author: Lokhande, M

Searching in archive eess.
  1. arXiv:2506.08785

    cs.AR cs.AI cs.CC eess.IV

    POLARON: Precision-aware On-device Learning and Adaptive Runtime-cONfigurable AI acceleration

    Authors: Mukul Lokhande, Santosh Kumar Vishvakarma

    Abstract: The increasing complexity of AI models requires flexible hardware capable of supporting diverse precision formats, particularly for energy-constrained edge platforms. This work presents PARV-CE, a SIMD-enabled, multi-precision MAC engine that performs efficient multiply-accumulate operations using a unified data-path for 4/8/16-bit fixed-point, floating point, and posit formats. The architecture i…

    Submitted 10 June, 2025; originally announced June 2025.

  2. arXiv:2506.07046

    cs.AR cs.CV cs.RO eess.IV

    QForce-RL: Quantized FPGA-Optimized Reinforcement Learning Compute Engine

    Authors: Anushka Jha, Tanushree Dewangan, Mukul Lokhande, Santosh Kumar Vishvakarma

    Abstract: Reinforcement Learning (RL) has outperformed its counterparts in sequential decision-making and dynamic environment control. However, FPGA deployment is significantly resource-expensive because training agents on high-quality images involves a large number of computations, which poses new challenges. In this work, we propose QForce-RL, which takes advantage of quantization to enhance throughput and…

    Submitted 8 June, 2025; originally announced June 2025.

  3. arXiv:2503.14354

    cs.AR cs.AI cs.CV cs.ET eess.IV

    Retrospective: A CORDIC Based Configurable Activation Function for NN Applications

    Authors: Omkar Kokane, Gopal Raut, Salim Ullah, Mukul Lokhande, Adam Teman, Akash Kumar, Santosh Kumar Vishvakarma

    Abstract: A CORDIC-based configuration for the design of Activation Functions (AF) was previously suggested to accelerate ASIC hardware design for resource-constrained systems by providing functional reconfigurability. Since its introduction, this new approach for neural network acceleration has gained widespread popularity, influencing numerous designs for activation functions in both academic and commerci…

    Submitted 18 March, 2025; originally announced March 2025.

  4. arXiv:2503.11685

    cs.AR cs.CV eess.IV

    CORDIC Is All You Need

    Authors: Omkar Kokane, Adam Teman, Anushka Jha, Guru Prasath SL, Gopal Raut, Mukul Lokhande, S. V. Jaya Chand, Tanushree Dewangan, Santosh Kumar Vishvakarma

    Abstract: Artificial intelligence necessitates adaptable hardware accelerators for efficient high-throughput million operations. We present a pipelined architecture with a CORDIC block for linear MAC computations and nonlinear iterative Activation Functions (AF) such as $tanh$, $sigmoid$, and $softmax$. This approach focuses on a Reconfigurable Processing Engine (RPE) based systolic array, with 40% pruning rat…

    Submitted 4 March, 2025; originally announced March 2025.

  5. arXiv:2412.11702

    cs.AR cs.CV cs.DC cs.ET eess.IV

    Flex-PE: Flexible and SIMD Multi-Precision Processing Element for AI Workloads

    Authors: Mukul Lokhande, Gopal Raut, Santosh Kumar Vishvakarma

    Abstract: The rapid adoption of data-driven AI models, such as deep learning inference, training, Vision Transformers (ViTs), and other HPC applications, drives a strong need for hardware support for different non-linear activation functions (AF) with runtime-configurable precision. Existing solutions support diverse precision or runtime AF reconfigurability but fail to address both simultaneously. This work propo…

    Submitted 16 December, 2024; originally announced December 2024.

    Comments: 10 pages, 5 figures, Preprint, Submitted to TVLSI Regular papers

  6. arXiv:2409.04976

    cs.AR cs.AI cs.CV eess.IV

    HYDRA: Hybrid Data Multiplexing and Run-time Layer Configurable DNN Accelerator

    Authors: Sonu Kumar, Komal Gupta, Gopal Raut, Mukul Lokhande, Santosh Kumar Vishvakarma

    Abstract: Deep neural networks (DNNs) pose many challenges for efficient computation at edge nodes, primarily due to their huge hardware resource demands. The article proposes HYDRA, a hybrid data multiplexing and runtime layer-configurable DNN accelerator, to overcome these drawbacks. The work proposes a layer-multiplexed approach, which further reuses a single activation function within the exec…

    Submitted 8 September, 2024; originally announced September 2024.