Showing 1–3 of 3 results for author: Wilkening, M

  1. arXiv:2105.08820  [pdf, other]

    cs.AR cs.AI cs.DC

    RecPipe: Co-designing Models and Hardware to Jointly Optimize Recommendation Quality and Performance

    Authors: Udit Gupta, Samuel Hsia, Jeff Zhang, Mark Wilkening, Javin Pombra, Hsien-Hsin S. Lee, Gu-Yeon Wei, Carole-Jean Wu, David Brooks

    Abstract: Deep learning recommendation systems must provide high quality, personalized content under strict tail-latency targets and high system loads. This paper presents RecPipe, a system to jointly optimize recommendation quality and inference performance. Central to RecPipe is decomposing recommendation models into multi-stage pipelines to maintain quality while reducing compute complexity and exposing…

    Submitted 22 May, 2021; v1 submitted 18 May, 2021; originally announced May 2021.

  2. arXiv:2102.00075  [pdf, other]

    cs.AR cs.LG

    RecSSD: Near Data Processing for Solid State Drive Based Recommendation Inference

    Authors: Mark Wilkening, Udit Gupta, Samuel Hsia, Caroline Trippel, Carole-Jean Wu, David Brooks, Gu-Yeon Wei

    Abstract: Neural personalized recommendation models are used across a wide variety of datacenter applications including search, social media, and entertainment. State-of-the-art models comprise large embedding tables that have billions of parameters requiring large memory capacities. Unfortunately, large and fast DRAM-based memories levy high infrastructure costs. Conventional SSD-based storage solutions of…

    Submitted 29 January, 2021; originally announced February 2021.

  3. arXiv:2010.05037  [pdf, other]

    cs.AR cs.DC cs.IR

    Cross-Stack Workload Characterization of Deep Recommendation Systems

    Authors: Samuel Hsia, Udit Gupta, Mark Wilkening, Carole-Jean Wu, Gu-Yeon Wei, David Brooks

    Abstract: Deep learning based recommendation systems form the backbone of most personalized cloud services. Though the computer architecture community has recently started to take notice of deep recommendation inference, the resulting solutions have taken wildly different approaches - ranging from near memory processing to at-scale optimizations. To better design future hardware systems for deep recommendat…

    Submitted 10 October, 2020; originally announced October 2020.

    Comments: Published in 2020 IEEE International Symposium on Workload Characterization (IISWC)