Showing 1–1 of 1 results for author: Mo, J C

  1. arXiv:2412.01042

    cs.CR cs.LG

    TruncFormer: Private LLM Inference Using Only Truncations

    Authors: Patrick Yubeaton, Jianqiao Cambridge Mo, Karthik Garimella, Nandan Kumar Jha, Brandon Reagen, Chinmay Hegde, Siddharth Garg

    Abstract: Private inference (PI) serves an important role in guaranteeing the privacy of user data when interfacing with proprietary machine learning models such as LLMs. However, PI remains practically intractable due to the massive latency costs associated with nonlinear functions present in LLMs. Existing works have focused on improving latency of specific LLM nonlinearities (such as the Softmax, or the…

    Submitted 1 December, 2024; originally announced December 2024.