
Learning from learning machines: a new generation of AI technology to meet the needs of science

Authors: Luca Pion-Tonachini, Kristofer Bouchard, Hector Garcia Martin, Sean Peisert, W. Bradley Holtz, Anil Aswani, Dipankar Dwivedi, Haruko Wainwright, Ghanshyam Pilania, Benjamin Nachman, Babetta L. Marrone, Nicola Falco, Prabhat, Daniel Arnold, Alejandro Wolf-Yadlin, Sarah Powers, Sharlee Climer, Quinn Jackson, Ty Carlson, Michael Sohn, Petrus Zwart, Neeraj Kumar, Amy Justice, Claire Tomlin, Daniel Jacobson, et al. (11 additional authors not shown)

Abstract: We outline emerging opportunities and challenges to enhance the utility of AI for scientific discovery. The distinct goals of AI for industry versus the goals of AI for science create tension between identifying patterns in data versus discovering patterns in the world from data. If we address the fundamental challenges associated with "bridging the gap" between domain-driven scientific models and data-driven AI learning machines, then we expect that these AI models can transform hypothesis generation, scientific discovery, and the scientific process itself.

Submitted 26 November, 2021; originally announced November 2021.

arXiv:2110.08271 [pdf, other]

Training Deep Neural Networks with Joint Quantization and Pruning of Weights and Activations

Authors: Xinyu Zhang, Ian Colbert, Ken Kreutz-Delgado, Srinjoy Das

Abstract: Quantization and pruning are core techniques used to reduce the inference costs of deep neural networks. State-of-the-art quantization techniques are currently applied to both the weights and activations; however, pruning is most often applied to only the weights of the network. In this work, we jointly apply novel uniform quantization and unstructured pruning methods to both the weights and activations of deep neural networks during training. Using our methods, we empirically evaluate the currently accepted prune-then-quantize paradigm across a wide range of computer vision tasks and observe a non-commutative nature when applied to both the weights and activations of deep neural networks. Informed by these observations, we articulate the non-commutativity hypothesis: for a given deep neural network being trained for a specific task, there exists an exact training schedule in which quantization and pruning can be introduced to optimize network performance. We identify that this optimal ordering not only exists, but also varies across discriminative and generative tasks. Using the optimal training schedule within our training framework, we demonstrate increased performance per memory footprint over existing solutions.

Submitted 1 November, 2021; v1 submitted 15 October, 2021; originally announced October 2021.

arXiv:2110.02690 [pdf, ps, other]

Tuning Confidence Bound for Stochastic Bandits with Bandit Distance

Authors: Xinyu Zhang, Srinjoy Das, Ken Kreutz-Delgado

Abstract: We propose a novel modification of the standard upper confidence bound (UCB) method for the stochastic multi-armed bandit (MAB) problem which tunes the confidence bound of a given bandit based on its distance to others. Our UCB distance tuning (UCB-DT) formulation enables improved performance as measured by expected regret by preventing the MAB algorithm from focusing on non-optimal bandits, which is a well-known deficiency of standard UCB. "Distance tuning" of the standard UCB is done using a proposed distance measure, which we call bandit distance, that is parameterizable and which therefore can be optimized to control the transition rate from exploration to exploitation based on problem requirements. We empirically demonstrate increased performance of UCB-DT versus many existing state-of-the-art methods which use the UCB formulation for the MAB problem. Our contribution also includes the development of a conceptual tool called the "Exploration Bargain Point" which gives insights into the tradeoffs between exploration and exploitation. We argue that the Exploration Bargain Point provides an intuitive perspective that is useful for comparatively analyzing the performance of UCB-based methods.

Submitted 6 October, 2021; originally announced October 2021.
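
Illustrative note: the abstract above modifies the standard UCB rule, so a minimal sketch of that baseline may help readers place the contribution. The code below is plain UCB1 on Bernoulli arms, not UCB-DT; the abstract does not define the bandit distance, so no distance tuning is attempted. The function name `ucb1`, the Bernoulli reward model, and the arm means used in the usage note are assumptions made for illustration only.

```python
import math
import random


def ucb1(arm_means, horizon, seed=0):
    """Run standard UCB1 (not UCB-DT) on Bernoulli arms.

    arm_means: hypothetical success probabilities, one per arm.
    Returns the number of times each arm was pulled.
    """
    rng = random.Random(seed)
    k = len(arm_means)
    counts = [0] * k      # pulls per arm
    sums = [0.0] * k      # total reward per arm

    for t in range(1, horizon + 1):
        if t <= k:
            # Initialisation: pull every arm once.
            arm = t - 1
        else:
            # Pick the arm with the largest empirical mean plus
            # confidence bonus sqrt(2 ln t / n_i).
            arm = max(
                range(k),
                key=lambda i: sums[i] / counts[i]
                + math.sqrt(2.0 * math.log(t) / counts[i]),
            )
        reward = 1.0 if rng.random() < arm_means[arm] else 0.0
        counts[arm] += 1
        sums[arm] += reward
    return counts
```

For example, `ucb1([0.1, 0.9], 2000)` returns two pull counts that sum to 2000, with most pulls going to the second arm. The confidence bonus is the term that a method such as UCB-DT would tune.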

arXiv:2107.07647 [pdf, other]

An Energy-Efficient Edge Computing Paradigm for Convolution-based Image Upsampling

Authors: Ian Colbert, Ken Kreutz-Delgado, Srinjoy Das

Abstract: A novel energy-efficient edge computing paradigm is proposed for real-time deep learning-based image upsampling applications. State-of-the-art deep learning solutions for image upsampling are currently trained using either resize or sub-pixel convolution to learn kernels that generate high fidelity images with minimal artifacts. However, performing inference with these learned convolution kernels requires memory-intensive feature map transformations that dominate time and energy costs in real-time applications. To alleviate this pressure on memory bandwidth, we confine the use of resize or sub-pixel convolution to training in the cloud by transforming learned convolution kernels to deconvolution kernels before deploying them for inference as a functionally equivalent deconvolution. These kernel transformations, intended as a one-time cost when shifting from training to inference, enable a systems designer to use each algorithm in its optimal context by preserving the image fidelity learned when training in the cloud while minimizing data transfer penalties during inference at the edge. We also explore existing variants of deconvolution inference algorithms and introduce a novel variant for consideration. We analyze and compare the inference properties of convolution-based upsampling algorithms using a quantitative model of incurred time and energy costs and show that using deconvolution for inference at the edge improves both system latency and energy efficiency when compared to the sub-pixel or resize convolution counterparts.

Submitted 26 July, 2021; v1 submitted 15 July, 2021; originally announced July 2021.

arXiv:2102.00534 [pdf]

Generative and Discriminative Deep Belief Network Classifiers: Comparisons Under an Approximate Computing Framework

Authors: Siqiao Ruan, Ian Colbert, Ken Kreutz-Delgado, Srinjoy Das

Abstract: The use of Deep Learning hardware algorithms for embedded applications is characterized by challenges such as constraints on device power consumption, availability of labeled data, and limited internet bandwidth for frequent training on cloud servers. To enable low power implementations, we consider efficient bitwidth reduction and pruning for the class of Deep Learning algorithms known as Discriminative Deep Belief Networks (DDBNs) for embedded-device classification tasks. We train DDBNs with both generative and discriminative objectives under an approximate computing framework and analyze their power-at-performance for supervised and semi-supervised applications. We also investigate the out-of-distribution performance of DDBNs when the inference data has the same class structure yet is statistically different from the training data owing to dynamic real-time operating environments. Based on our analysis, we provide novel insights and recommendations for choice of training objectives, bitwidth values, and accuracy sensitivity with respect to the amount of labeled data for implementing DDBN inference with minimum power consumption on embedded hardware platforms subject to accuracy tolerances.

Submitted 31 January, 2021; originally announced February 2021.

arXiv:2102.00294 [pdf, other]

A Competitive Edge: Can FPGAs Beat GPUs at DCNN Inference Acceleration in Resource-Limited Edge Computing Applications?

Authors: Ian Colbert, Jake Daly, Ken Kreutz-Delgado, Srinjoy Das

Abstract: When trained as generative models, Deep Learning algorithms have shown exceptional performance on tasks involving high dimensional data such as image denoising and super-resolution. In an increasingly connected world dominated by mobile and edge devices, there is surging demand for these algorithms to run locally on embedded platforms. FPGAs, by virtue of their reprogrammability and low-power characteristics, are ideal candidates for these edge computing applications. As such, we design a spatio-temporally parallelized hardware architecture capable of accelerating a deconvolution algorithm optimized for power-efficient inference on a resource-limited FPGA. We propose this FPGA-based accelerator to be used for Deconvolutional Neural Network (DCNN) inference in low-power edge computing applications. To this end, we develop methods that systematically exploit micro-architectural innovations, design space exploration, and statistical analysis. Using a Xilinx PYNQ-Z2 FPGA, we leverage our architecture to accelerate inference for two DCNNs trained on the MNIST and CelebA datasets using the Wasserstein GAN framework. On these networks, our FPGA design achieves a higher throughput to power ratio with lower run-to-run variation when compared to the NVIDIA Jetson TX1 edge computing GPU.

Submitted 9 March, 2021; v1 submitted 30 January, 2021; originally announced February 2021.

arXiv:1910.12454 [pdf, other]

PT-MMD: A Novel Statistical Framework for the Evaluation of Generative Systems

Authors: Alexander Potapov, Ian Colbert, Ken Kreutz-Delgado, Alexander Cloninger, Srinjoy Das

Abstract: Stochastic-sampling-based Generative Neural Networks, such as Restricted Boltzmann Machines and Generative Adversarial Networks, are now used for applications such as denoising, image occlusion removal, pattern completion, and motion synthesis. In scenarios which involve performing such inference tasks with these models, it is critical to determine metrics that allow for model selection and/or maintenance of requisite generative performance under pre-specified implementation constraints. In this paper, we propose a new metric for evaluating generative model performance based on $p$-values derived from the combined use of Maximum Mean Discrepancy (MMD) and permutation-based (PT-based) resampling, which we refer to as PT-MMD. We demonstrate the effectiveness of this metric for two cases: (1) Selection of bitwidth and activation function complexity to achieve minimum power-at-performance for Restricted Boltzmann Machines; (2) Quantitative comparison of images generated by two types of Generative Adversarial Networks (PGAN and WGAN) to facilitate model selection in order to maximize the fidelity of generated images. For these applications, our results are shown using Euclidean and Haar-based kernels for the PT-MMD two sample hypothesis test. This demonstrates the critical role of distance functions in comparing generated images against their corresponding ground truth counterparts as they would be perceived by human users.

Submitted 28 October, 2019; originally announced October 2019.

Comments: Will be presented at the Asilomar Conference on Signals, Systems, and Computers
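
Illustrative note: the abstract above derives a p-value from Maximum Mean Discrepancy combined with permutation resampling. A generic version of that idea can be sketched as follows. This is not the authors' PT-MMD implementation: the Gaussian kernel, the `bandwidth` parameter, the biased MMD estimator, and all function names are assumptions chosen for illustration (the abstract itself mentions Euclidean and Haar-based kernels, whose exact form is not given here).

```python
import math
import random


def gaussian_kernel(x, y, bandwidth=1.0):
    """Gaussian kernel on two equal-length numeric tuples (assumed kernel)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def mmd2(gram, idx_x, idx_y):
    """Biased estimate of squared MMD from a precomputed Gram matrix."""
    def mean_k(rows, cols):
        total = sum(gram[i][j] for i in rows for j in cols)
        return total / (len(rows) * len(cols))
    return mean_k(idx_x, idx_x) + mean_k(idx_y, idx_y) - 2.0 * mean_k(idx_x, idx_y)


def permutation_mmd_pvalue(xs, ys, n_perm=200, bandwidth=1.0, seed=0):
    """Permutation-test p-value for the hypothesis that xs and ys
    come from the same distribution, using MMD as the test statistic."""
    rng = random.Random(seed)
    pooled = list(xs) + list(ys)
    n = len(xs)
    # Compute the kernel once; permutations only reshuffle indices.
    gram = [[gaussian_kernel(a, b, bandwidth) for b in pooled] for a in pooled]
    idx = list(range(len(pooled)))
    observed = mmd2(gram, idx[:n], idx[n:])
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(idx)
        if mmd2(gram, idx[:n], idx[n:]) >= observed:
            exceed += 1
    # Add-one correction keeps the p-value strictly positive.
    return (exceed + 1) / (n_perm + 1)
```

In this sketch, `xs` could hold feature vectors of generated samples and `ys` those of reference samples; a small p-value suggests the two sets are distinguishable under the chosen kernel.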

arXiv:1903.04659 [pdf, other]

AX-DBN: An Approximate Computing Framework for the Design of Low-Power Discriminative Deep Belief Networks

Authors: Ian Colbert, Ken Kreutz-Delgado, Srinjoy Das

Abstract: The power budget for embedded hardware implementations of Deep Learning algorithms can be extremely tight. To address implementation challenges in such domains, new design paradigms, like Approximate Computing, have drawn significant attention. Approximate Computing exploits the innate error-resilience of Deep Learning algorithms, a property that makes them amenable for deployment on low-power computing platforms. This paper describes an Approximate Computing design methodology, AX-DBN, for an architecture belonging to the class of stochastic Deep Learning algorithms known as Deep Belief Networks (DBNs). Specifically, we consider procedures for efficiently implementing the Discriminative Deep Belief Network (DDBN), a stochastic neural network which is used for classification tasks, extending Approximate Computing from the analysis of deterministic to stochastic neural networks. For the purpose of optimizing the DDBN for hardware implementations, we explore the use of: (a) Limited precision of neurons and functional approximations of activation functions; (b) Criticality analysis to identify nodes in the network which can operate at reduced precision while allowing the network to maintain target accuracy levels; and (c) A greedy search methodology with incremental retraining to determine the optimal reduction in precision for all neurons to maximize power savings. Using the AX-DBN methodology proposed in this paper, we present experimental results across several network architectures that show significant power savings under a user-specified accuracy loss constraint with respect to ideal full precision implementations.

Submitted 26 March, 2019; v1 submitted 11 March, 2019; originally announced March 2019.

arXiv:1901.07915 [pdf, other]

doi 10.1016/j.neuroimage.2019.05.026

ICLabel: An automated electroencephalographic independent component classifier, dataset, and website

Authors: Luca Pion-Tonachini, Ken Kreutz-Delgado, Scott Makeig

Abstract: The electroencephalogram (EEG) provides a non-invasive, minimally restrictive, and relatively low cost measure of mesoscale brain dynamics with high temporal resolution. Although signals recorded in parallel by multiple, near-adjacent EEG scalp electrode channels are highly correlated and combine signals from many different sources, biological and non-biological, independent component analysis (ICA) has been shown to isolate the various source generator processes underlying those recordings. Independent components (IC) found by ICA decomposition can be manually inspected, selected, and interpreted, but doing so requires both time and practice as ICs have no particular order or intrinsic interpretations and therefore require further study of their properties. Alternatively, sufficiently accurate automated IC classifiers can be used to classify ICs into broad source categories, speeding the analysis of EEG studies with many subjects and enabling the use of ICA decomposition in near-real-time applications. While many such classifiers have been proposed recently, this work presents the ICLabel project comprised of (1) an IC dataset containing spatiotemporal measures for over 200,000 ICs from more than 6,000 EEG recordings, (2) a website for collecting crowdsourced IC labels and educating EEG researchers and practitioners about IC interpretation, and (3) the automated ICLabel classifier. The classifier improves upon existing methods in two ways: by improving the accuracy of the computed label estimates and by enhancing its computational efficiency. The ICLabel classifier outperforms or performs comparably to the previous best publicly available method for all measured IC categories while computing those labels ten times faster than that classifier as shown in a rigorous comparison against all other publicly available EEG IC classifiers.

Submitted 4 February, 2019; v1 submitted 22 January, 2019; originally announced January 2019.

Comments: Intended for NeuroImage. Updated from version one with minor editorial and figure changes

arXiv:1705.02583 [pdf, other]

A Design Methodology for Efficient Implementation of Deconvolutional Neural Networks on an FPGA

Authors: Xinyu Zhang, Srinjoy Das, Ojash Neopane, Ken Kreutz-Delgado

Abstract: In recent years, deep learning algorithms have shown extremely high performance on machine learning tasks such as image classification and speech recognition. In support of such applications, various FPGA accelerator architectures have been proposed for convolutional neural networks (CNNs) that enable high performance for classification tasks at lower power than CPU and GPU processors. However, to date, there has been little research on the use of FPGA implementations of deconvolutional neural networks (DCNNs). DCNNs, also known as generative CNNs, encode high-dimensional probability distributions and have been widely used for computer vision applications such as scene completion, scene segmentation, image creation, image denoising, and super-resolution imaging. We propose an FPGA architecture for deconvolutional networks built around an accelerator which effectively handles the complex memory access patterns needed to perform strided deconvolutions, and that supports convolution as well. We also develop a three-step design optimization method that systematically exploits statistical analysis, design space exploration and VLSI optimization. To verify our FPGA deconvolutional accelerator design methodology we train DCNNs offline on two representative datasets using the generative adversarial network method (GAN) run on TensorFlow, and then map these DCNNs to an FPGA DCNN-plus-accelerator implementation to perform generative inference on a Xilinx Zynq-7000 FPGA. Our DCNN implementation achieves a peak performance density of 0.012 GOPs/DSP.

Submitted 7 May, 2017; originally announced May 2017.

arXiv:1704.03993 [pdf, other]

ApproxDBN: Approximate Computing for Discriminative Deep Belief Networks

Authors: Xiaojing Xu, Srinjoy Das, Ken Kreutz-Delgado

Abstract: Probabilistic generative neural networks are useful for many applications, such as image classification, speech recognition and occlusion removal. However, the power budget for hardware implementations of neural networks can be extremely tight. To address this challenge we describe a design methodology for using approximate computing methods to implement Approximate Deep Belief Networks (ApproxDBNs) by systematically exploring the use of (1) limited precision of variables; (2) criticality analysis to identify the nodes in the network which can operate with such limited precision while allowing the network to maintain target accuracy levels; and (3) a greedy search methodology with incremental retraining to determine the optimal reduction in precision to maximize power savings under user-specified accuracy constraints. Experimental results show that significant bit-length reduction can be achieved by our ApproxDBN with constrained accuracy loss.

Submitted 6 May, 2017; v1 submitted 13 April, 2017; originally announced April 2017.

Comments: 8 pages, 7 figures

arXiv:1602.05996 [pdf, other]

A Nonparametric Framework for Quantifying Generative Inference on Neuromorphic Systems

Authors: Ojash Neopane, Srinjoy Das, Ery Arias-Castro, Kenneth Kreutz-Delgado

Abstract: Restricted Boltzmann Machines and Deep Belief Networks have been successfully used in probabilistic generative model applications such as image occlusion removal, pattern completion and motion synthesis. Generative inference in such algorithms can be performed very efficiently on hardware using a Markov Chain Monte Carlo procedure called Gibbs sampling, where stochastic samples are drawn from noisy integrate and fire neurons implemented on neuromorphic substrates. Currently, no satisfactory metrics exist for evaluating the generative performance of such algorithms implemented on high-dimensional data for neuromorphic platforms. This paper demonstrates the application of nonparametric goodness-of-fit testing both to quantify the generative performance and to provide decision-directed criteria for choosing the parameters of the neuromorphic Gibbs sampler and optimizing usage of hardware resources used during sampling.

Submitted 18 February, 2016; originally announced February 2016.

Comments: Accepted for lecture presentation at ISCAS 2016

arXiv:1512.00156 [pdf, other]

Covariance-domain Dictionary Learning for Overcomplete EEG Source Identification

Authors: Ozgur Balkan, Kenneth Kreutz-Delgado, Scott Makeig

Abstract: We propose an algorithm targeting the identification of more sources than channels for electroencephalography (EEG). Our overcomplete source identification algorithm, Cov-DL, leverages dictionary learning methods applied in the covariance-domain. Assuming that EEG sources are uncorrelated within moving time-windows and the scalp mixing is linear, the forward problem can be transferred to the covariance domain which has higher dimensionality than the original EEG channel domain. This allows for learning the overcomplete mixing matrix that generates the scalp EEG even when there may be more sources than sensors active at any time segment, i.e. when there are non-sparse sources. This is contrary to straightforward dictionary learning methods that are based on the assumption of sparsity, which is not satisfied in the case of low-density EEG systems. We present two different learning strategies for Cov-DL, determined by the size of the target mixing matrix. We demonstrate that Cov-DL outperforms existing overcomplete ICA algorithms under various scenarios of EEG simulations and real EEG experiments.

Submitted 1 December, 2015; originally announced December 2015.

arXiv:1509.07302 [pdf, other]

doi 10.1109/TBCAS.2016.2539352

Mapping Generative Models onto a Network of Digital Spiking Neurons

Authors: Bruno U. Pedroni, Srinjoy Das, John V. Arthur, Paul A. Merolla, Bryan L. Jackson, Dharmendra S. Modha, Kenneth Kreutz-Delgado, Gert Cauwenberghs

Abstract: Stochastic neural networks such as Restricted Boltzmann Machines (RBMs) have been successfully used in applications ranging from speech recognition to image classification. Inference and learning in these algorithms use a Markov Chain Monte Carlo procedure called Gibbs sampling, where a logistic function forms the kernel of this sampler. On the other side of the spectrum, neuromorphic systems have shown great promise for low-power and parallelized cognitive computing, but lack well-suited applications and automation procedures. In this work, we propose a systematic method for bridging the RBM algorithm and digital neuromorphic systems, with a generative pattern completion task as proof of concept. For this, we first propose a method of producing the Gibbs sampler using bio-inspired digital noisy integrate-and-fire neurons. Next, we describe the process of mapping generative RBMs trained offline onto the IBM TrueNorth neurosynaptic processor -- a low-power digital neuromorphic VLSI substrate. Mapping these algorithms onto neuromorphic hardware presents unique challenges in network connectivity and weight and bias quantization, which, in turn, require architectural and design strategies for the physical realization. Generative performance metrics are analyzed to validate the neuromorphic requirements and to best select the neuron parameters for the model. Lastly, we describe a design automation procedure which achieves optimal resource usage, accounting for the novel hardware adaptations. This work represents the first implementation of generative RBM inference on a neuromorphic VLSI substrate.

Submitted 9 October, 2015; v1 submitted 24 September, 2015; originally announced September 2015.

Comments: A similar version of this manuscript has been submitted to IEEE TBioCAS for revision in October 2015

arXiv:1503.07793 [pdf, other]

Gibbs Sampling with Low-Power Spiking Digital Neurons

Authors: Srinjoy Das, Bruno Umbria Pedroni, Paul Merolla, John Arthur, Andrew S. Cassidy, Bryan L. Jackson, Dharmendra Modha, Gert Cauwenberghs, Ken Kreutz-Delgado

Abstract: Restricted Boltzmann Machines and Deep Belief Networks have been successfully used in a wide variety of applications including image classification and speech recognition. Inference and learning in these algorithms use a Markov Chain Monte Carlo procedure called Gibbs sampling. A sigmoidal function forms the kernel of this sampler which can be realized from the firing statistics of noisy integrate-and-fire neurons on a neuromorphic VLSI substrate. This paper demonstrates such an implementation on an array of digital spiking neurons with stochastic leak and threshold properties for inference tasks and presents some key performance metrics for such a hardware-based sampler in both the generative and discriminative contexts.

Submitted 27 March, 2015; v1 submitted 26 March, 2015; originally announced March 2015.

Comments: Accepted at ISCAS 2015

arXiv:1311.0966 [pdf, other]

doi 10.3389/fnins.2013.00272

Event-Driven Contrastive Divergence for Spiking Neuromorphic Systems

Authors: Emre Neftci, Srinjoy Das, Bruno Pedroni, Kenneth Kreutz-Delgado, Gert Cauwenberghs

Abstract: Restricted Boltzmann Machines (RBMs) and Deep Belief Networks have been demonstrated to perform efficiently in a variety of applications, such as dimensionality reduction, feature learning, and classification. Their implementation on neuromorphic hardware platforms emulating large-scale networks of spiking neurons can have significant advantages from the perspectives of scalability, power dissipation and real-time interfacing with the environment. However, the traditional RBM architecture and the commonly used training algorithm known as Contrastive Divergence (CD) are based on discrete updates and exact arithmetic which do not directly map onto a dynamical neural substrate. Here, we present an event-driven variation of CD to train an RBM constructed with Integrate & Fire (I&F) neurons, that is constrained by the limitations of existing and near future neuromorphic hardware platforms. Our strategy is based on neural sampling, which allows us to synthesize a spiking neural network that samples from a target Boltzmann distribution. The recurrent activity of the network replaces the discrete steps of the CD algorithm, while Spike Time Dependent Plasticity (STDP) carries out the weight updates in an online, asynchronous fashion. We demonstrate our approach by training an RBM composed of leaky I&F neurons with STDP synapses to learn a generative model of the MNIST hand-written digit dataset, and by testing it in recognition, generation and cue integration tasks. Our results contribute to a machine learning-driven approach for synthesizing networks of spiking neurons capable of carrying out practical, high-level functionality.

Submitted 9 December, 2013; v1 submitted 4 November, 2013; originally announced November 2013.

Comments: (Under review)

arXiv:1210.5975 [pdf, other]

Solid State Disk Object-Based Storage with Trim Commands

Authors: Tasha Frankie, Gordon Hughes, Ken Kreutz-Delgado

Abstract: This paper presents a model of NAND flash SSD utilization and write amplification when the ATA/ATAPI SSD Trim command is incorporated into object-based storage under a variety of user workloads, including a uniform random workload with objects of fixed size and a uniform random workload with objects of varying sizes. We first summarize the existing models for write amplification in SSDs for workloads with and without the Trim command, then propose an alteration of the models that utilizes a framework of object-based storage. The utilization of objects and pages in the SSD is derived, with the analytic results compared to simulation. Finally, the effect of objects on write amplification and its computation is discussed along with a potential application to optimization of SSD usage through object storage metadata servers that allocate object classes of distinct object size.

Submitted 10 October, 2012; originally announced October 2012.

arXiv:1208.1794 [pdf, ps, other]

Analysis of Trim Commands on Overprovisioning and Write Amplification in Solid State Drives

Authors: Tasha Frankie, Gordon Hughes, Ken Kreutz-Delgado

Abstract: This paper presents a performance model of the ATA/ATAPI SSD Trim command under various types of user workloads, including a uniform random workload, a workload with hot and cold data, and a workload with N temperatures of data. We first examine the Trim-modified uniform random workload to predict utilization, then use this result to compute the resultant level of effective overprovisioning. This allows modification of models previously suggested to predict write amplification of a non-Trim uniform random workload under greedy garbage collection. Finally, we expand the theory to cover a workload consisting of hot and cold data (and also N temperatures of data), providing formulas to predict write amplification in these scenarios.

Submitted 8 August, 2012; originally announced August 2012.

Showing 1–18 of 18 results for author: Kreutz-Delgado, K