arXiv:1902.03650 [pdf, other]

doi 10.1109/LMAG.2019.2910787

Low Barrier Magnet Design for Efficient Hardware Binary Stochastic Neurons

Authors: Orchi Hassan, Rafatul Faria, Kerem Y. Camsari, Jonathan Z. Sun, Supriyo Datta

Abstract: Binary stochastic neurons (BSN's) form an integral part of many machine learning algorithms, motivating the development of hardware accelerators for this complex function. It has been recognized that hardware BSN's can be implemented using low barrier magnets (LBM's) by minimally modifying present-day magnetoresistive random access memory (MRAM) devices. A crucial parameter that determines the response of these LBM based BSN designs is the \emph{correlation time} of magnetization, $τ_c$. In this letter, we show that for magnets with low energy barriers ($Δ\approx k_BT$ and below), circular disk magnets with in-plane magnetic anisotropy (IMA) lead to $τ_c$ values that are two orders of magnitude smaller compared to $τ_c$ for magnets having perpendicular magnetic anisotropy (PMA) and provide analytical descriptions. We show that this striking difference in $τ_c$ is due to a precession-like fluctuation mechanism that is enabled by the large demagnetization field in IMA magnets. We provide a detailed energy-delay performance evaluation of previously proposed BSN designs based on Spin-Orbit-Torque (SOT) MRAM and Spin-Transfer-Torque (STT) MRAM employing low barrier circular IMA magnets by SPICE simulations. The designs exhibit sub-ns response times leading to energy requirements of $\sim$a few fJ to evaluate the BSN function, orders of magnitude lower than digital CMOS implementations with a much larger footprint. While modern MRAM technology is based on PMA magnets, results in this paper suggest that low barrier circular IMA magnets may be more suitable for this application.

Submitted 20 April, 2019; v1 submitted 10 February, 2019; originally announced February 2019.

Journal ref: IEEE Magnetics Letters (2019)

arXiv:1210.8400 [pdf, ps, other]

doi 10.1109/TCOMM.2013.071813.120833

Distributed Quantization Networks

Authors: John Z. Sun, Vivek K. Goyal

Abstract: Several key results in distributed source coding offer the intuition that little improvement in compression can be gained from intersensor communication when the information is coded in long blocks. However, when sensors are restricted to code their observations in small blocks (e.g., 1), intelligent collaboration between sensors can greatly reduce distortion. For networks where sensors are allowed to "chat" using a side channel that is unobservable at the fusion center, we provide asymptotically-exact characterization of distortion performance and optimal quantizer design in the high-resolution (low-distortion) regime using a framework called distributed functional scalar quantization (DFSQ). The key result is that chatting can dramatically improve performance even when intersensor communication is at very low rate, especially if the fusion center desires fidelity of a nonlinear computation applied to source realizations rather than fidelity in representing the sources themselves. We also solve the rate allocation problem when communication links have heterogeneous costs and provide a detailed example to demonstrate the theoretical and practical gains from chatting. This example for maximum computation gives insight on the gap between chatting and distributed networks, and how to optimize the intersensor communication.

Submitted 31 October, 2012; originally announced October 2012.

Journal ref: IEEE Trans. on Communications, vol. 61, no. 9, pp. 3931-3942, September 2013

arXiv:1206.1299 [pdf, ps, other]

doi 10.1109/TSP.2013.2259483

Distributed Functional Scalar Quantization Simplified

Authors: John Z. Sun, Vinith Misra, Vivek K Goyal

Abstract: Distributed functional scalar quantization (DFSQ) theory provides optimality conditions and predicts performance of data acquisition systems in which a computation on acquired data is desired. We address two limitations of previous works: prohibitively expensive decoder design and a restriction to sources with bounded distributions. We rigorously show that a much simpler decoder has equivalent asymptotic performance as the conditional expectation estimator previously explored, thus reducing decoder design complexity. The simpler decoder has the feature of decoupled communication and computation blocks. Moreover, we extend the DFSQ framework with the simpler decoder to acquire sources with infinite-support distributions such as Gaussian or exponential distributions. Finally, through simulation results we demonstrate that performance at moderate coding rates is well predicted by the asymptotic analysis, and we give new insight on the rate of convergence.

Submitted 6 June, 2012; originally announced June 2012.

Journal ref: IEEE Trans. on Signal Processing, vol. 61, no. 14, pp. 3495-3508, July 2013

arXiv:1110.2098 [pdf, ps, other]

Dynamic Matrix Factorization: A State Space Approach

Authors: John Z. Sun, Kush R. Varshney, Karthik Subbian

Abstract: Matrix factorization from a small number of observed entries has recently garnered much attention as the key ingredient of successful recommendation systems. One unresolved problem in this area is how to adapt current methods to handle changing user preferences over time. Recent proposals to address this issue are heuristic in nature and do not fully exploit the time-dependent structure of the problem. As a principled and general temporal formulation, we propose a dynamical state space model of matrix factorization. Our proposal builds upon probabilistic matrix factorization, a Bayesian model with Gaussian priors. We utilize results in state tracking, such as the Kalman filter, to provide accurate recommendations in the presence of both process and measurement noise. We show how system parameters can be learned via expectation-maximization and provide comparisons to current published techniques.

Submitted 4 August, 2012; v1 submitted 10 October, 2011; originally announced October 2011.

arXiv:0905.2214 [pdf, ps, other]

The Rainbow Skip Graph: A Fault-Tolerant Constant-Degree P2P Relay Structure

Authors: Michael T. Goodrich, Michael J. Nelson, Jonathan Z. Sun

Abstract: We present a distributed data structure, which we call the rainbow skip graph. To our knowledge, this is the first peer-to-peer data structure that simultaneously achieves high fault tolerance, constant-sized nodes, and fast update and query times for ordered data. It is a non-trivial adaptation of the SkipNet/skip-graph structures of Harvey et al. and Aspnes and Shah, so as to provide fault-tolerance as these structures do, but to do so using constant-sized nodes, as in the family tree structure of Zatloukal and Harvey. It supports successor queries on a set of n items using O(log n) messages with high probability, an improvement over the expected O(log n) messages of the family tree.

Submitted 13 May, 2009; originally announced May 2009.

Comments: Expanded version of a paper appearing in ACM-SIAM Symp. on Discrete Algorithms (SODA)

arXiv:cs/0507049 [pdf, ps, other]

The Skip Quadtree: A Simple Dynamic Data Structure for Multidimensional Data

Authors: David Eppstein, Michael T. Goodrich, Jonathan Z. Sun

Abstract: We present a new multi-dimensional data structure, which we call the skip quadtree (for point data in R^2) or the skip octree (for point data in R^d, with constant d>2). Our data structure combines the best features of two well-known data structures, in that it has the well-defined "box"-shaped regions of region quadtrees and the logarithmic-height search and update hierarchical structure of skip lists. Indeed, the bottom level of our structure is exactly a region quadtree (or octree for higher dimensional data). We describe efficient algorithms for inserting and deleting points in a skip quadtree, as well as fast methods for performing point location and approximate range queries.

Submitted 19 July, 2005; originally announced July 2005.

Comments: 12 pages, 3 figures. A preliminary version of this paper appeared in the 21st ACM Symp. Comp. Geom., Pisa, 2005, pp. 296-305

ACM Class: F.2.2

Showing 1–6 of 6 results for author: Sun, J Z