Showing 1–8 of 8 results for author: Aamodt, T M

Searching in archive cs.
  1. arXiv:2506.13814  [pdf, ps, other]

    cs.GR cs.LG eess.IV

    ReFrame: Layer Caching for Accelerated Inference in Real-Time Rendering

    Authors: Lufei Liu, Tor M. Aamodt

    Abstract: Graphics rendering applications increasingly leverage neural networks in tasks such as denoising, supersampling, and frame extrapolation to improve image quality while maintaining frame rates. The temporal coherence inherent in these tasks presents an opportunity to reuse intermediate results from previous frames and avoid redundant computations. Recent work has shown that caching intermediate fea…

    Submitted 14 June, 2025; originally announced June 2025.

    Comments: Published at ICML 2025

  2. arXiv:2303.02273  [pdf, other]

    cs.LG cs.CV

    Learning Label Encodings for Deep Regression

    Authors: Deval Shah, Tor M. Aamodt

    Abstract: Deep regression networks are widely used to tackle the problem of predicting a continuous value for a given input. Task-specialized approaches for training regression networks have shown significant improvement over generic approaches, such as direct regression. More recently, a generic approach based on regression by binary classification using binary-encoded labels has shown significant improvem…

    Submitted 3 March, 2023; originally announced March 2023.

    Comments: Published at ICLR 2023 (Notable top-25%)

    Journal ref: International Conference on Learning Representations 2023 (https://openreview.net/pdf?id=k60XE_b0Ix6)

  3. arXiv:2212.01927  [pdf, other]

    cs.LG cs.CV

    Label Encoding for Regression Networks

    Authors: Deval Shah, Zi Yu Xue, Tor M. Aamodt

    Abstract: Deep neural networks are used for a wide range of regression problems. However, there exists a significant gap in accuracy between specialized approaches and generic direct regression in which a network is trained by minimizing the squared or absolute error of output labels. Prior work has shown that solving a regression problem with a set of binary classifiers can improve accuracy by utilizing we…

    Submitted 4 December, 2022; originally announced December 2022.

    Comments: Published at ICLR 2022

    Journal ref: International Conference on Learning Representations 2022 (https://openreview.net/pdf?id=8WawVDdKqlL)

  4. arXiv:2110.08906  [pdf, other]

    cs.AR cs.RO

    Characterizing and Improving the Resilience of Accelerators in Autonomous Robots

    Authors: Deval Shah, Zi Yu Xue, Karthik Pattabiraman, Tor M. Aamodt

    Abstract: Motion planning is a computationally intensive and well-studied problem in autonomous robots. However, motion planning hardware accelerators (MPA) must be soft-error resilient for deployment in safety-critical applications, and blanket application of traditional mitigation techniques is ill-suited due to cost, power, and performance overheads. We propose Collision Exposure Factor (CEF), a novel me…

    Submitted 17 October, 2021; originally announced October 2021.

    Comments: 14 pages

    ACM Class: B.8.1; B.5.3

  5. arXiv:2001.01969  [pdf, other]

    cs.LG stat.ML

    Sparse Weight Activation Training

    Authors: Md Aamir Raihan, Tor M. Aamodt

    Abstract: Neural network training is computationally and memory intensive. Sparse training can reduce the burden on emerging hardware platforms designed to accelerate sparse computations, but it can affect network convergence. In this work, we propose a novel CNN training algorithm Sparse Weight Activation Training (SWAT). SWAT is more computation and memory-efficient than conventional training. SWAT modifi…

    Submitted 31 October, 2020; v1 submitted 7 January, 2020; originally announced January 2020.

    Comments: Published at NeurIPS 2020

  6. arXiv:1903.06658  [pdf, other]

    cs.GR cs.MM eess.IV

    Surface Compression Using Dynamic Color Palettes

    Authors: Ayub A. Gubran, Felix Huang, Tor M. Aamodt

    Abstract: Off-chip memory traffic is a major source of power and energy consumption on mobile platforms. A large amount of this off-chip traffic is used to manipulate graphics framebuffer surfaces. To cut down the cost of accessing off-chip memory, framebuffer surfaces are compressed to reduce the bandwidth consumed on surface manipulation when rendering or displaying. In this work, we study the compressi…

    Submitted 18 January, 2019; originally announced March 2019.

    Comments: 13 pages, 18 figures

  7. arXiv:1701.03878  [pdf, other]

    cs.AR

    HoLiSwap: Reducing Wire Energy in L1 Caches

    Authors: Yatish Turakhia, Subhasis Das, Tor M. Aamodt, William J. Dally

    Abstract: This paper describes HoLiSwap, a method to reduce L1 cache wire energy, a significant fraction of total cache energy, by swapping hot lines to the cache way nearest to the processor. We observe that (i) a small fraction (<3%) of cache lines (hot lines) serve over 60% of the L1 cache accesses and (ii) the difference in wire energy between the nearest and farthest cache subarray can be over 6…

    Submitted 13 January, 2017; originally announced January 2017.

  8. arXiv:1606.01607  [pdf, ps, other]

    cs.AR cs.PF

    CG-OoO: Energy-Efficient Coarse-Grain Out-of-Order Execution

    Authors: Milad Mohammadi, Tor M. Aamodt, William J. Dally

    Abstract: We introduce the Coarse-Grain Out-of-Order (CG-OoO) general purpose processor designed to achieve close to In-Order processor energy while maintaining Out-of-Order (OoO) performance. CG-OoO is an energy-performance proportional general purpose architecture that scales according to the program load. Block-level code processing is at the heart of this architecture; CG-OoO speculates, fetches, s…

    Submitted 5 June, 2016; originally announced June 2016.

    Comments: 11 pages