Showing 1–6 of 6 results for author: Lim, J P

Searching in archive cs.
  1. arXiv:2501.06246  [pdf, other]

    cs.CL cs.AI cs.DS

    A partition cover approach to tokenization

    Authors: Jia Peng Lim, Shawn Tan, Davin Choo, Hady W. Lauw

    Abstract: Tokenization is the process of encoding strings into tokens of a fixed vocabulary size, and is widely utilized in Natural Language Processing applications. The leading tokenization algorithm today is Byte-Pair Encoding (BPE), which formulates the tokenization problem as a compression problem and tackles it by performing sequences of merges. In this work, we formulate tokenization as an optimizatio…

    Submitted 25 May, 2025; v1 submitted 8 January, 2025; originally announced January 2025.

    Comments: under review

  2. arXiv:2210.02349  [pdf, other]

    eess.IV cs.CV cs.LG q-bio.NC

    Fitting a Directional Microstructure Model to Diffusion-Relaxation MRI Data with Self-Supervised Machine Learning

    Authors: Jason P. Lim, Stefano B. Blumberg, Neil Narayan, Sean C. Epstein, Daniel C. Alexander, Marco Palombo, Paddy J. Slator

    Abstract: Machine learning is a powerful approach for fitting microstructural models to diffusion MRI data. Early machine learning microstructure imaging implementations trained regressors to estimate model parameters in a supervised way, using synthetic training data with known ground truth. However, a drawback of this approach is that the choice of training data impacts fitted parameter values. Self-super…

    Submitted 5 October, 2022; originally announced October 2022.

    Comments: Oral Presentation in: Computational Diffusion MRI Workshop (CDMRI) at Medical Image Computing and Computer Assisted Intervention (MICCAI) 2022

  3. arXiv:2111.12852  [pdf, other]

    cs.MS

    RLIBM-PROG: Progressive Polynomial Approximations for Fast Correctly Rounded Math Libraries

    Authors: Mridul Aanjaneya, Jay P. Lim, Santosh Nagarakatte

    Abstract: This paper presents a novel method for generating a single polynomial approximation that produces correctly rounded results for all inputs of an elementary function for multiple representations. The generated polynomial approximation has the nice property that the first few lower degree terms produce correctly rounded results for specific representations of smaller bitwidths, which we call progres…

    Submitted 17 March, 2022; v1 submitted 24 November, 2021; originally announced November 2021.

    Comments: 14 pages

    Report number: Rutgers Department of Computer Science Technical Report DCS-TR-758

  4. arXiv:2108.06756  [pdf, other]

    cs.MS

    RLIBM-ALL: A Novel Polynomial Approximation Method to Produce Correctly Rounded Results for Multiple Representations and Rounding Modes

    Authors: Jay P. Lim, Santosh Nagarakatte

    Abstract: Mainstream math libraries for floating point (FP) do not produce correctly rounded results for all inputs. In contrast, CR-LIBM and RLIBM provide correctly rounded implementations for a specific FP representation with one rounding mode. Using such libraries for a representation with a new rounding mode or with different precision will result in wrong results due to double rounding. This paper prop…

    Submitted 29 November, 2021; v1 submitted 15 August, 2021; originally announced August 2021.

    Comments: 28 pages

    Report number: Rutgers Department of Computer Science Technical Report DCS-TR-757

  5. arXiv:2104.04043  [pdf, other]

    cs.MS

    RLIBM-32: High Performance Correctly Rounded Math Libraries for 32-bit Floating Point Representations

    Authors: Jay P. Lim, Santosh Nagarakatte

    Abstract: This paper proposes a set of techniques to develop correctly rounded math libraries for 32-bit float and posit types. It enhances our RLibm approach that frames the problem of generating correctly rounded libraries as a linear programming problem in the context of 16-bit types to scale to 32-bit types. Specifically, this paper proposes new algorithms to (1) generate polynomials that produce correc…

    Submitted 8 April, 2021; originally announced April 2021.

    Comments: 23 pages

    Report number: Rutgers Department of Computer Science Technical Report DCS-TR-754

  6. arXiv:2007.05344  [pdf, other]

    cs.MS

    A Novel Approach to Generate Correctly Rounded Math Libraries for New Floating Point Representations

    Authors: Jay P. Lim, Mridul Aanjaneya, John Gustafson, Santosh Nagarakatte

    Abstract: Given the importance of floating-point (FP) performance in numerous domains, several new variants of FP and its alternatives have been proposed (e.g., Bfloat16, TensorFloat32, and Posits). These representations do not have correctly rounded math libraries. Further, the use of existing FP libraries for these new representations can produce incorrect results. This paper proposes a novel approach for…

    Submitted 20 November, 2020; v1 submitted 9 July, 2020; originally announced July 2020.

    Comments: 44 pages

    Report number: Rutgers DCS Technical Report 753