Showing 1–7 of 7 results for author: Moskewicz, M

  1. arXiv:1811.07070  [pdf, other]

    cs.CV

    DSCnet: Replicating Lidar Point Clouds with Deep Sensor Cloning

    Authors: Paden Tomasello, Sammy Sidhu, Anting Shen, Matthew W. Moskewicz, Nobie Redmon, Gayatri Joshi, Romi Phadte, Paras Jain, Forrest Iandola

    Abstract: Convolutional neural networks (CNNs) have become increasingly popular for solving a variety of computer vision tasks, ranging from image classification to image segmentation. Recently, autonomous vehicles have created a demand for depth information, which is often obtained using hardware sensors such as Light detection and ranging (LIDAR). Although it can provide precise distance measurements, mos…

    Submitted 26 November, 2018; v1 submitted 16 November, 2018; originally announced November 2018.

    Comments: V2

  2. arXiv:1611.06945  [pdf, other]

    cs.NE cs.DC cs.MS

    A Metaprogramming and Autotuning Framework for Deploying Deep Learning Applications

    Authors: Matthew W. Moskewicz, Ali Jannesari, Kurt Keutzer

    Abstract: In recent years, deep neural networks (DNNs) have yielded strong results on a wide range of applications. Graphics Processing Units (GPUs) have been one key enabling factor leading to the current popularity of DNNs. However, despite increasing hardware flexibility and software programming toolchain maturity, high efficiency GPU programming remains difficult: it suffers from high complexity, low p…

    Submitted 21 November, 2016; originally announced November 2016.

  3. arXiv:1606.01561  [pdf, other]

    cs.CV

    Shallow Networks for High-Accuracy Road Object-Detection

    Authors: Khalid Ashraf, Bichen Wu, Forrest N. Iandola, Matthew W. Moskewicz, Kurt Keutzer

    Abstract: The ability to automatically detect other vehicles on the road is vital to the safety of partially-autonomous and fully-autonomous vehicles. Most of the high-accuracy techniques for this task are based on R-CNN or one of its faster variants. In the research community, much emphasis has been applied to using 3D vision or complex R-CNN variants to achieve higher accuracy. However, are there more str…

    Submitted 5 June, 2016; originally announced June 2016.

    Comments: 9 pages, 5 figures

  4. arXiv:1606.00094  [pdf, other]

    cs.DC cs.MS cs.NE

    Boda-RTC: Productive Generation of Portable, Efficient Code for Convolutional Neural Networks on Mobile Computing Platforms

    Authors: Matthew Moskewicz, Forrest Iandola, Kurt Keutzer

    Abstract: The popularity of neural networks (NNs) spans academia, industry, and popular culture. In particular, convolutional neural networks (CNNs) have been applied to many image based machine learning tasks and have yielded strong results. The availability of hardware/software systems for efficient training and deployment of large and/or deep CNN models has been, and continues to be, an important conside…

    Submitted 13 September, 2016; v1 submitted 31 May, 2016; originally announced June 2016.

  5. arXiv:1602.07360  [pdf, other]

    cs.CV cs.AI

    SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <0.5MB model size

    Authors: Forrest N. Iandola, Song Han, Matthew W. Moskewicz, Khalid Ashraf, William J. Dally, Kurt Keutzer

    Abstract: Recent research on deep neural networks has focused primarily on improving accuracy. For a given accuracy level, it is typically possible to identify multiple DNN architectures that achieve that accuracy level. With equivalent accuracy, smaller DNN architectures offer at least three advantages: (1) Smaller DNNs require less communication across servers during distributed training. (2) Smaller DNNs…

    Submitted 4 November, 2016; v1 submitted 23 February, 2016; originally announced February 2016.

    Comments: In ICLR Format

  6. arXiv:1511.00175  [pdf, other]

    cs.CV

    FireCaffe: near-linear acceleration of deep neural network training on compute clusters

    Authors: Forrest N. Iandola, Khalid Ashraf, Matthew W. Moskewicz, Kurt Keutzer

    Abstract: Long training times for high-accuracy deep neural networks (DNNs) impede research into new DNN architectures and slow the development of high-accuracy DNNs. In this paper we present FireCaffe, which successfully scales deep neural network training across a cluster of GPUs. We also present a number of best practices to aid in comparing advancements in methods for scaling and accelerating the traini…

    Submitted 8 January, 2016; v1 submitted 31 October, 2015; originally announced November 2015.

    Comments: Version 2: Added results on 128 GPUs

  7. arXiv:1404.1869  [pdf, other]

    cs.CV

    DenseNet: Implementing Efficient ConvNet Descriptor Pyramids

    Authors: Forrest Iandola, Matt Moskewicz, Sergey Karayev, Ross Girshick, Trevor Darrell, Kurt Keutzer

    Abstract: Convolutional Neural Networks (CNNs) can provide accurate object classification. They can be extended to perform object detection by iterating over dense or selected proposed object regions. However, the runtime of such detectors scales as the total number and/or area of regions to examine per image, and training such detectors may be prohibitively slow. However, for some CNN classifier topologies…

    Submitted 7 April, 2014; originally announced April 2014.