-
Machine Intelligence on Wireless Edge Networks
Authors:
Sri Krishna Vadlamani,
Kfir Sulimany,
Zhihui Gao,
Tingjun Chen,
Dirk Englund
Abstract:
Deep neural network (DNN) inference on power-constrained edge devices is bottlenecked by costly weight storage and data movement. We introduce MIWEN, a radio-frequency (RF) analog architecture that "disaggregates" memory by streaming weights wirelessly and performing classification in the analog front end of standard transceivers. By encoding weights and activations onto RF carriers and using native mixers as computation units, MIWEN eliminates local weight memory and the overhead of analog-to-digital and digital-to-analog conversion. We derive the effective number of bits of radio-frequency analog computation under thermal noise, quantify the energy-precision trade-off, and demonstrate digital-comparable MNIST accuracy at orders-of-magnitude lower energy, unlocking real-time inference on low-power, memory-free edge devices.
Submitted 13 June, 2025;
originally announced June 2025.
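As a rough companion to the MIWEN abstract above, the sketch below shows how an effective number of bits (ENOB) can be tied to a thermal noise floor. It assumes the textbook ADC relation ENOB = (SNR_dB - 1.76) / 6.02 and a kTB noise power; this is not the derivation from the paper, and the function names and the example values (290 K, 1 MHz, -60 dBm) are hypothetical.

```python
import math

BOLTZMANN = 1.380649e-23  # J/K (exact SI value)


def thermal_noise_dbm(bandwidth_hz, temperature_k=290.0):
    """Thermal noise power k*T*B, expressed in dBm."""
    noise_watts = BOLTZMANN * temperature_k * bandwidth_hz
    return 10.0 * math.log10(noise_watts / 1e-3)


def enob_from_snr_db(snr_db):
    """Textbook ADC relation between SNR (dB) and effective number of bits."""
    return (snr_db - 1.76) / 6.02


def enob_under_thermal_noise(signal_dbm, bandwidth_hz, temperature_k=290.0):
    """ENOB when thermal noise is assumed to be the only noise source."""
    snr_db = signal_dbm - thermal_noise_dbm(bandwidth_hz, temperature_k)
    return enob_from_snr_db(snr_db)


if __name__ == "__main__":
    # Hypothetical operating point: -60 dBm signal in a 1 MHz bandwidth.
    print(enob_under_thermal_noise(-60.0, 1e6))
```

Under this simplified model, every additional 6.02 dB of signal power buys about one more bit, which is one way to picture the energy-precision trade-off the abstract mentions.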
-
Disaggregated Deep Learning via In-Physics Computing at Radio Frequency
Authors:
Zhihui Gao,
Sri Krishna Vadlamani,
Kfir Sulimany,
Dirk Englund,
Tingjun Chen
Abstract:
Modern edge devices, such as cameras, drones, and Internet-of-Things nodes, rely on deep learning to enable a wide range of intelligent applications, including object recognition, environment perception, and autonomous navigation. However, deploying deep learning models directly on the often resource-constrained edge devices demands significant memory footprints and computational power for real-time inference using traditional digital computing architectures. In this paper, we present WISE, a novel computing architecture for wireless edge networks designed to overcome energy constraints in deep learning inference. WISE achieves this goal through two key innovations: disaggregated model access via wireless broadcasting and in-physics computation of general complex-valued matrix-vector multiplications directly at radio frequency. Using a software-defined radio platform with model weights broadcast wirelessly over the air, we demonstrate that WISE achieves 95.7% image classification accuracy with an ultra-low operating energy of 6.0 fJ/MAC per client, corresponding to a computation efficiency of 165.8 TOPS/W. This approach enables energy-efficient deep learning inference on wirelessly connected edge devices, achieving more than two orders of magnitude improvement in efficiency compared to traditional digital computing.
Submitted 24 April, 2025;
originally announced April 2025.
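The two efficiency figures quoted in the WISE abstract above (6.0 fJ/MAC and 165.8 TOPS/W) can be cross-checked with a short conversion. The sketch assumes that one MAC is counted as one operation, which the abstract does not state explicitly; the function names are hypothetical.

```python
def tops_per_watt(energy_per_op_joules):
    """Operations per second per watt, in units of 1e12 (TOPS/W)."""
    return 1.0 / energy_per_op_joules / 1e12


def femtojoules_per_op(tops_w):
    """Energy per operation in femtojoules for a given TOPS/W figure."""
    return 1e15 / (tops_w * 1e12)


if __name__ == "__main__":
    print(tops_per_watt(6.0e-15))     # efficiency implied by 6.0 fJ per operation
    print(femtojoules_per_op(165.8))  # energy per operation implied by 165.8 TOPS/W
```

Under that assumption, 6.0 fJ per operation gives roughly 166.7 TOPS/W, and 165.8 TOPS/W corresponds to roughly 6.03 fJ per operation, so the two numbers agree up to rounding.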
-
Micro-Ring Perceptron Sensor for High-Speed, Low-Power Radio-Frequency Signal
Authors:
Bo-Han Wu,
Shi-Yuan Ma,
Sri Krishna Vadlamani,
Hyeongrak Choi,
Dirk Englund
Abstract:
Radio-frequency (RF) sensing enables long-range, high-resolution detection for applications such as radar and wireless communication. RF photonic sensing mitigates the bandwidth limitations and high transmission losses of electronic systems by transducing the detected RF signals into broadband optical carriers. However, these sensing systems remain limited by detector noise and Nyquist-rate sampling with analog-to-digital converters, particularly under low-power and high-data-rate conditions. To overcome these limitations, we introduce the micro-ring perceptron (MiRP) sensor, a physics-inspired AI framework that integrates the micro-ring (MiR) dynamics-based analog processor with a machine-learning-driven digital backend. By embedding the nonlinear optical dynamics of MiRs into an end-to-end architecture, MiRP sensing maps the input signal into a learned feature space for the subsequent digital neural network. The trick is to encode the entire temporal structure of the incoming signal into each output sample, which effectively enables sub-Nyquist sampling without loss of task-relevant information. Evaluations on three target classification datasets demonstrate the performance advantages of MiRP sensing. For example, on MNIST, MiRP detection achieves $94\pm0.1$\% accuracy at $1/49$ the Nyquist rate with an input RF signal power of $1$~pW, compared to $11\pm0.4$\% for the conventional RF detection method. Thus, our sensor framework provides a robust and efficient solution for the detection of low-power and high-speed signals in real-world sensing applications.
Submitted 18 April, 2025;
originally announced April 2025.
-
Small Vision-Language Models: A Survey on Compact Architectures and Techniques
Authors:
Nitesh Patnaik,
Navdeep Nayak,
Himani Bansal Agrawal,
Moinak Chinmoy Khamaru,
Gourav Bal,
Saishree Smaranika Panda,
Rishi Raj,
Vishal Meena,
Kartheek Vadlamani
Abstract:
The emergence of small vision-language models (sVLMs) marks a critical advancement in multimodal AI, enabling efficient processing of visual and textual data in resource-constrained environments. This survey offers a comprehensive exploration of sVLM development, presenting a taxonomy of architectures - transformer-based, mamba-based, and hybrid - that highlight innovations in compact design and computational efficiency. Techniques such as knowledge distillation, lightweight attention mechanisms, and modality pre-fusion are discussed as enablers of high performance with reduced resource requirements. Through an in-depth analysis of models like TinyGPT-V, MiniGPT-4, and VL-Mamba, we identify trade-offs between accuracy, efficiency, and scalability. Persistent challenges, including data biases and generalization to complex tasks, are critically examined, with proposed pathways for addressing them. By consolidating advancements in sVLMs, this work underscores their transformative potential for accessible AI, setting a foundation for future research into efficient multimodal systems.
Submitted 9 March, 2025;
originally announced March 2025.
-
QAMNet: Fast and Efficient Optical QAM Neural Networks
Authors:
Marc Gong Bacvanski,
Sri Krishna Vadlamani,
Kfir Sulimany,
Dirk Robert Englund
Abstract:
The energy consumption of neural network inference has become a topic of paramount importance with the growing success and adoption of deep neural networks. Analog optical neural networks (ONNs) can reduce the energy of matrix-vector multiplication in neural network inference below that of digital electronics. However, realizing this promise remains challenging due to digital-to-analog conversion: even at low bit precisions $b$, encoding the $2^b$ levels of digital weights and inputs into the analog domain requires specialized and power-hungry electronics. Faced with similar challenges, the field of telecommunications has developed the complex-valued Quadrature-Amplitude Modulation (QAM), the workhorse modulation format for decades. QAM maximally exploits the complex amplitude to provide a quadratic $O(N^2) \to O(N)$ energy saving over intensity-only modulation. Inspired by this advantage, this work introduces QAMNet, an optical neural network hardware and architecture with lower energy consumption than existing ONNs, which uses QAM to fully exploit the complex nature of the amplitude of light. Based on physics-based simulations, we show that QAMNet, when implemented with conventional telecommunications equipment, accelerates complex-valued deep neural networks with accuracies indistinguishable from those of digital hardware. Compared to standard ONNs, we find that QAMNet ONNs: (1) attain higher accuracy above moderate levels of total bit precision, (2) are more accurate above low energy budgets, and (3) are an optimal choice when hardware bit precision is limited.
Submitted 19 September, 2024; v1 submitted 18 September, 2024;
originally announced September 2024.
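One way to picture the $O(N^2) \to O(N)$ scaling mentioned in the QAMNet abstract above is to compare the mean symbol energy of two constellations that each carry N distinct levels at unit minimum spacing. The model below is a textbook-style illustration and an assumption on my part, not the paper's analysis: the amplitude-only case uses non-negative field amplitudes 0, 1, ..., N-1, and the QAM case uses a centered square grid of N points. All function names are hypothetical.

```python
import math


def mean_energy_amplitude_only(n_levels):
    """Mean symbol energy for amplitudes 0, 1, ..., n_levels - 1."""
    return sum(a * a for a in range(n_levels)) / n_levels


def mean_energy_square_qam(n_points):
    """Mean symbol energy for a centered square QAM grid with unit spacing."""
    side = math.isqrt(n_points)
    if side * side != n_points:
        raise ValueError("n_points must be a perfect square")
    # Per-axis levels, centered on zero (e.g. -1.5, -0.5, 0.5, 1.5 for side = 4).
    axis = [k - (side - 1) / 2 for k in range(side)]
    return sum(i * i + q * q for i in axis for q in axis) / n_points


if __name__ == "__main__":
    for n in (4, 16, 64, 256):
        ratio = mean_energy_amplitude_only(n) / mean_energy_square_qam(n)
        print(n, mean_energy_amplitude_only(n), mean_energy_square_qam(n), ratio)
```

In this toy model the amplitude-only energy grows as (N-1)(2N-1)/6, while the square-QAM energy grows as (N-1)/6, so their ratio grows linearly with N.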
-
Improved Differential Evolution based Feature Selection through Quantum, Chaos, and Lasso
Authors:
Yelleti Vivek,
Sri Krishna Vadlamani,
Vadlamani Ravi,
P. Radha Krishna
Abstract:
Modern deep learning continues to achieve outstanding performance on an astounding variety of high-dimensional tasks. In practice, this is obtained by fitting deep neural models to all the input data with minimal feature engineering, thus sacrificing interpretability in many cases. However, in applications such as medicine, where interpretability is crucial, feature subset selection becomes an important problem. Metaheuristics such as Binary Differential Evolution are a popular approach to feature selection, and the research literature continues to introduce novel ideas, drawn from quantum computing and chaos theory, for instance, to improve them. In this paper, we demonstrate that introducing chaos-generated variables, generated from considerations of the Lyapunov time, in place of random variables in quantum-inspired metaheuristics significantly improves their performance on high-dimensional medical classification tasks and outperforms other approaches. We show that this chaos-induced improvement is a general phenomenon by demonstrating it for multiple varieties of underlying quantum-inspired metaheuristics. Performance is further enhanced through Lasso-assisted feature pruning. At the implementation level, we vastly speed up our algorithms through a scalable island-based computing cluster parallelization technique.
Submitted 20 August, 2024;
originally announced August 2024.
-
Towards the Information-Theoretic Limit of Programmable Photonics
Authors:
Ryan Hamerly,
Jasvith Raj Basani,
Alexander Sludds,
Sri Krishna Vadlamani,
Dirk Englund
Abstract:
The scalability of many programmable photonic circuits is limited by the $2\pi$ tuning range needed for the constituent phase shifters. To address this problem, we introduce the concept of a phase-efficient circuit architecture, where the average phase shift is $\ll 2\pi$. We derive a universal information-theoretic limit to the phase-shift efficiency of universal multiport interferometers, and propose a "3-MZI" architecture that approaches this limit to within a factor of $2\times$, approximately a $10\times$ reduction in average phase shift over the prior art, where the average phase shift scales inversely with system size as $O(1/\sqrt{N})$. For non-unitary circuits, we show that the 3-MZI saturates the theoretical bound for Gaussian-distributed target matrices. Using this architecture, we show optical neural network training with all phase shifters constrained to $\lesssim 0.2$ radians without loss of accuracy.
Submitted 18 August, 2024;
originally announced August 2024.
-
Quantum-secure multiparty deep learning
Authors:
Kfir Sulimany,
Sri Krishna Vadlamani,
Ryan Hamerly,
Prahlad Iyengar,
Dirk Englund
Abstract:
Secure multiparty computation enables the joint evaluation of multivariate functions across distributed users while ensuring the privacy of their local inputs. This field has become increasingly urgent due to the exploding demand for computationally intensive deep learning inference. These computations are typically offloaded to cloud computing servers, leading to vulnerabilities that can compromise the security of the clients' data. To solve this problem, we introduce a linear algebra engine that leverages the quantum nature of light for information-theoretically secure multiparty computation using only conventional telecommunication components. We apply this linear algebra engine to deep learning and derive rigorous upper bounds on the information leakage of both the deep neural network weights and the client's data via the Holevo and the Cramér-Rao bounds, respectively. Applied to the MNIST classification task, we obtain test accuracies exceeding $96\%$ while leaking less than $0.1$ bits per weight symbol and $0.01$ bits per data symbol. This weight leakage is an order of magnitude below the minimum bit precision required for accurate deep learning using state-of-the-art quantization techniques. Our work lays the foundation for practical quantum-secure computation and unlocks secure cloud deep learning as a field.
Submitted 13 September, 2024; v1 submitted 10 August, 2024;
originally announced August 2024.
-
Transferable Learning on Analog Hardware
Authors:
Sri Krishna Vadlamani,
Dirk Englund,
Ryan Hamerly
Abstract:
While analog neural network (NN) accelerators promise massive energy and time savings, an important challenge is to make them robust to static fabrication error. Present-day training methods for programmable photonic interferometer circuits, a leading analog NN platform, do not produce networks that perform well in the presence of static hardware errors. Moreover, existing hardware error correction techniques either require individual re-training of every analog NN (which is impractical in an edge setting with millions of devices), place stringent demands on component quality, or introduce hardware overhead. We solve all three problems by introducing one-time error-aware training techniques that produce robust NNs that match the performance of ideal hardware and can be exactly transferred to arbitrary highly faulty photonic NNs with hardware errors up to 5x larger than present-day fabrication tolerances.
Submitted 12 October, 2022;
originally announced October 2022.
-
A Self-Similar Sine-Cosine Fractal Architecture for Multiport Interferometers
Authors:
Jasvith Raj Basani,
Sri Krishna Vadlamani,
Saumil Bandyopadhyay,
Dirk R. Englund,
Ryan Hamerly
Abstract:
Multiport interferometers based on integrated beamsplitter meshes have recently captured interest as a platform for many emerging technologies. In this paper, we present a novel architecture for multiport interferometers based on the Sine-Cosine fractal decomposition of a unitary matrix. Our architecture is unique in that it is self-similar, enabling the construction of modular multi-chiplet devices. Due to this modularity, our design enjoys improved resilience to hardware imperfections as compared to conventional multiport interferometers. Additionally, the structure of our circuit enables systematic truncation, which is key in reducing the hardware footprint of the chip as well as compute time in training optical neural networks, while maintaining full connectivity. Numerical simulations show that truncation of these meshes gives robust performance even under large fabrication errors. This design is a step forward in the construction of large-scale programmable photonics, removing a major hurdle in scaling up to practical machine learning and quantum computing applications.
Submitted 7 September, 2022;
originally announced September 2022.
-
Equivalence of coupled parametric oscillator dynamics to Lagrange multiplier primal-dual optimization
Authors:
Sri Krishna Vadlamani,
Tianyao Patrick Xiao,
Eli Yablonovitch
Abstract:
There has been a recent surge of interest in physics-based solvers for combinatorial optimization problems. We present a dynamical solver for the Ising problem that is comprised of a network of coupled parametric oscillators and show that it implements Lagrange multiplier constrained optimization. We show that the pump depletion effect, which is intrinsic to parametric oscillators, enforces binary constraints and enables the system's continuous analog variables to converge to the optimal binary solutions to the optimization problem. Moreover, there is an exact correspondence between the equations of motion for the coupled oscillators and the update rules in the primal-dual method of Lagrange multipliers. Though our analysis is performed using electrical LC oscillators, it can be generalized to any system of coupled parametric oscillators. We simulate the dynamics of the coupled oscillator system and demonstrate that the performance of the solver on a set of benchmark problems is comparable to the best-known results obtained by digital algorithms in the literature.
Submitted 5 April, 2022;
originally announced April 2022.
-
A Novel Indoor Positioning System for unprepared firefighting scenarios
Authors:
Vamsi Karthik Vadlamani,
Manish Bhattarai,
Meenu Ajith,
Manel Martínez-Ramón
Abstract:
Situational awareness and indoor location tracking for firefighters are of paramount importance in search and rescue operations. For indoor positioning systems (IPS), GPS is not the best possible solution. Other techniques exist, such as dead reckoning, Wi-Fi and Bluetooth based triangulation, and Structure from Motion (SFM) based scene reconstruction. However, due to high temperatures, the rapidly changing environment of fires, and low parallax in thermal images, these techniques are not suitable for relaying, in real time, the information needed to increase situational awareness in a firefighting environment. In firefighting environments, thermal imaging cameras are used because of smoke and low visibility; hence, obtaining relative orientation from vanishing point estimation is very difficult. This research implements a novel optical flow based video compass for orientation estimation and fused IMU data based activity recognition for IPS. This technique helps first responders enter unprepared, unknown environments and still maintain situational awareness, such as the orientation and position of the victim firefighters.
Submitted 4 August, 2020;
originally announced August 2020.
-
Physics Successfully Implements Lagrange Multiplier Optimization
Authors:
Sri Krishna Vadlamani,
Tianyao Patrick Xiao,
Eli Yablonovitch
Abstract:
Optimization is a major part of human effort. While being mathematical, optimization is also built into physics. For example, physics has the principle of Least Action, the principle of Minimum Entropy Generation, and the Variational Principle. Physics also has physical annealing which, of course, preceded computational Simulated Annealing. Physics has the Adiabatic Principle, which in its quantum form is called Quantum Annealing. Thus, physical machines can solve the mathematical problem of optimization, including constraints. Binary constraints can be built into the physical optimization. In that case the machines are digital in the same sense that a flip-flop is digital. A wide variety of machines have had recent success at optimizing the Ising magnetic energy. We demonstrate in this paper that almost all those machines perform optimization according to the Principle of Minimum Entropy Generation as put forth by Onsager. Further, we show that this optimization is in fact equivalent to Lagrange multiplier optimization for constrained problems. We find that the physical gain coefficients which drive those systems actually play the role of the corresponding Lagrange Multipliers.
Submitted 21 July, 2020; v1 submitted 11 July, 2020;
originally announced July 2020.