
Showing 1–16 of 16 results for author: Park, S M

Searching in archive cs.

  1. arXiv:2506.15021  [pdf, ps, other]

    cs.LG cs.AI

    SFT-GO: Supervised Fine-Tuning with Group Optimization for Large Language Models

    Authors: Gyuhak Kim, Sumiran Singh Thakur, Su Min Park, Wei Wei, Yujia Bao

    Abstract: Supervised fine-tuning (SFT) has become an essential step in tailoring large language models (LLMs) to align with human expectations and specific downstream tasks. However, existing SFT methods typically treat each training instance as a uniform sequence, giving equal importance to all tokens regardless of their relevance. This overlooks the fact that only a subset of tokens often contains critica…

    Submitted 17 June, 2025; originally announced June 2025.

  2. arXiv:2410.23232  [pdf, other]

    cs.LG

    Attribute-to-Delete: Machine Unlearning via Datamodel Matching

    Authors: Kristian Georgiev, Roy Rinberg, Sung Min Park, Shivam Garg, Andrew Ilyas, Aleksander Madry, Seth Neel

    Abstract: Machine unlearning -- efficiently removing the effect of a small "forget set" of training data on a pre-trained machine learning model -- has recently attracted significant research interest. Despite this interest, however, recent work shows that existing machine unlearning techniques do not hold up to thorough evaluation in non-convex settings. In this work, we introduce a new machine unlearning…

    Submitted 11 November, 2024; v1 submitted 30 October, 2024; originally announced October 2024.

  3. arXiv:2408.05917  [pdf]

    cs.CE cs.AI cs.LG

    Inverse design of Non-parameterized Ventilated Acoustic Resonator via Variational Autoencoder with Acoustic Response-encoded Latent Space

    Authors: Min Woo Cho, Seok Hyeon Hwang, Jun-Young Jang, Jin Yeong Song, Sun-kwang Hwang, Kyoung Je Cha, Dong Yong Park, Kyungjun Song, Sang Min Park

    Abstract: Ventilated acoustic resonator(VAR), a type of acoustic metamaterial, emerge as an alternative for sound attenuation in environments that require ventilation, owing to its excellent low-frequency attenuation performance and flexible shape adaptability. However, due to the non-linear acoustic responses of VARs, the VAR designs are generally obtained within a limited parametrized design space, and th…

    Submitted 12 August, 2024; originally announced August 2024.

  4. arXiv:2406.06559  [pdf, other]

    cs.CL cs.AI cs.LG

    Harnessing Business and Media Insights with Large Language Models

    Authors: Yujia Bao, Ankit Parag Shah, Neeru Narang, Jonathan Rivers, Rajeev Maksey, Lan Guan, Louise N. Barrere, Shelley Evenson, Rahul Basole, Connie Miao, Ankit Mehta, Fabien Boulay, Su Min Park, Natalie E. Pearson, Eldhose Joy, Tiger He, Sumiran Thakur, Koustav Ghosal, Josh On, Phoebe Morrison, Tim Major, Eva Siqi Wang, Gina Escobar, Jiaheng Wei, Tharindu Cyril Weerasooriya, et al. (8 additional authors not shown)

    Abstract: This paper introduces Fortune Analytics Language Model (FALM). FALM empowers users with direct access to comprehensive business analysis, including market trends, company performance metrics, and expert insights. Unlike generic LLMs, FALM leverages a curated knowledge base built from professional journalism, enabling it to deliver precise and in-depth answers to intricate business questions. Users…

    Submitted 2 June, 2024; originally announced June 2024.

  5. arXiv:2312.06205  [pdf, other]

    cs.CV cs.LG

    The Journey, Not the Destination: How Data Guides Diffusion Models

    Authors: Kristian Georgiev, Joshua Vendrow, Hadi Salman, Sung Min Park, Aleksander Madry

    Abstract: Diffusion models trained on large datasets can synthesize photo-realistic images of remarkable quality and diversity. However, attributing these images back to the training data -- that is, identifying specific training examples which caused an image to be generated -- remains a challenge. In this paper, we propose a framework that: (i) provides a formal notion of data attribution in the context of diff…

    Submitted 11 December, 2023; originally announced December 2023.

    Comments: 29 pages, 17 figures

  6. arXiv:2308.04470  [pdf]

    cs.NE cs.LG

    D-Score: A Synapse-Inspired Approach for Filter Pruning

    Authors: Doyoung Park, Jinsoo Kim, Jina Nam, Jooyoung Chang, Sang Min Park

    Abstract: This paper introduces a new aspect for determining the rank of the unimportant filters for filter pruning on convolutional neural networks (CNNs). In the human synaptic system, there are two important channels known as excitatory and inhibitory neurotransmitters that transmit a signal from a neuron to a cell. Adopting the neuroscientific perspective, we propose a synapse-inspired filter pruning me…

    Submitted 8 August, 2023; originally announced August 2023.

    Comments: 9 pages, 5 figures, 2 tables

  7. arXiv:2306.12517  [pdf, other]

    cs.LG cs.CV

    FFCV: Accelerating Training by Removing Data Bottlenecks

    Authors: Guillaume Leclerc, Andrew Ilyas, Logan Engstrom, Sung Min Park, Hadi Salman, Aleksander Madry

    Abstract: We present FFCV, a library for easy and fast machine learning model training. FFCV speeds up model training by eliminating (often subtle) data bottlenecks from the training process. In particular, we combine techniques such as an efficient file storage format, caching, data pre-loading, asynchronous data transfer, and just-in-time compilation to (a) make data loading and transfer significantly mor…

    Submitted 21 June, 2023; originally announced June 2023.

  8. arXiv:2303.16205  [pdf]

    eess.IV cs.LG physics.optics

    mHealth hyperspectral learning for instantaneous spatiospectral imaging of hemodynamics

    Authors: Yuhyun Ji, Sang Mok Park, Semin Kwon, Jung Woo Leem, Vidhya Vijayakrishnan Nair, Yunjie Tong, Young L. Kim

    Abstract: Hyperspectral imaging acquires data in both the spatial and frequency domains to offer abundant physical or biological information. However, conventional hyperspectral imaging has intrinsic limitations of bulky instruments, slow data acquisition rate, and spatiospectral tradeoff. Here we introduce hyperspectral learning for snapshot hyperspectral imaging in which sampled hyperspectral data in a sm…

    Submitted 5 April, 2023; v1 submitted 27 March, 2023; originally announced March 2023.

    Journal ref: PNAS Nexus, pgad111, 2023

  9. arXiv:2303.14186  [pdf, other]

    stat.ML cs.LG

    TRAK: Attributing Model Behavior at Scale

    Authors: Sung Min Park, Kristian Georgiev, Andrew Ilyas, Guillaume Leclerc, Aleksander Madry

    Abstract: The goal of data attribution is to trace model predictions back to training data. Despite a long line of work towards this goal, existing approaches to data attribution tend to force users to choose between computational tractability and efficacy. That is, computationally tractable methods can struggle with accurately attributing model predictions in non-convex settings (e.g., in the context of de…

    Submitted 3 April, 2023; v1 submitted 24 March, 2023; originally announced March 2023.

  10. arXiv:2211.12491  [pdf, other]

    cs.LG cs.CV stat.ML

    ModelDiff: A Framework for Comparing Learning Algorithms

    Authors: Harshay Shah, Sung Min Park, Andrew Ilyas, Aleksander Madry

    Abstract: We study the problem of (learning) algorithm comparison, where the goal is to find differences between models trained with two different learning algorithms. We begin by formalizing this goal as one of finding distinguishing feature transformations, i.e., input transformations that change the predictions of models trained with one learning algorithm but not the other. We then present ModelDiff, a…

    Submitted 22 November, 2022; originally announced November 2022.

  11. arXiv:2207.05739  [pdf, other]

    cs.LG

    A Data-Based Perspective on Transfer Learning

    Authors: Saachi Jain, Hadi Salman, Alaa Khaddaj, Eric Wong, Sung Min Park, Aleksander Madry

    Abstract: It is commonly believed that in transfer learning including more pre-training data translates into better performance. However, recent evidence suggests that removing data from the source dataset can actually help too. In this work, we take a closer look at the role of the source dataset's composition in transfer learning and present a framework for probing its impact on downstream performance. Ou…

    Submitted 12 July, 2022; originally announced July 2022.

  12. arXiv:2206.00384  [pdf, other]

    cs.CV cs.LG

    Generalized Supervised Contrastive Learning

    Authors: Jaewon Kim, Hyukjong Lee, Jooyoung Chang, Sang Min Park

    Abstract: With the recent promising results of contrastive learning in the self-supervised learning paradigm, supervised contrastive learning has successfully extended these contrastive approaches to supervised contexts, outperforming cross-entropy on various datasets. However, supervised contrastive learning inherently employs label information in a binary form--either positive or negative--using a one-hot…

    Submitted 21 May, 2023; v1 submitted 1 June, 2022; originally announced June 2022.

  13. arXiv:2202.00622  [pdf, other]

    stat.ML cs.CV cs.LG

    Datamodels: Predicting Predictions from Training Data

    Authors: Andrew Ilyas, Sung Min Park, Logan Engstrom, Guillaume Leclerc, Aleksander Madry

    Abstract: We present a conceptual framework, datamodeling, for analyzing the behavior of a model class in terms of the training data. For any fixed "target" example $x$, training set $S$, and learning algorithm, a datamodel is a parameterized function $2^S \to \mathbb{R}$ that for any subset of $S' \subset S$ -- using only information about which examples of $S$ are contained in $S'$ -- predicts the outcome…

    Submitted 1 February, 2022; originally announced February 2022.

  14. arXiv:2112.15329  [pdf, other]

    cs.LG cs.CV

    On Distinctive Properties of Universal Perturbations

    Authors: Sung Min Park, Kuo-An Wei, Kai Xiao, Jerry Li, Aleksander Madry

    Abstract: We identify properties of universal adversarial perturbations (UAPs) that distinguish them from standard adversarial perturbations. Specifically, we show that targeted UAPs generated by projected gradient descent exhibit two human-aligned properties: semantic locality and spatial invariance, which standard targeted adversarial perturbations lack. We also demonstrate that UAPs contain significantly…

    Submitted 31 December, 2021; originally announced December 2021.

  15. arXiv:1811.10106  [pdf, other]

    math.ST cs.LG stat.ML

    Sparse PCA from Sparse Linear Regression

    Authors: Guy Bresler, Sung Min Park, Madalina Persu

    Abstract: Sparse Principal Component Analysis (SPCA) and Sparse Linear Regression (SLR) have a wide range of applications and have attracted a tremendous amount of attention in the last two decades as canonical examples of statistical problems in high dimension. A variety of algorithms have been proposed for both SPCA and SLR, but an explicit connection between the two had not been made. We show how to effi…

    Submitted 25 November, 2018; originally announced November 2018.

    Comments: To appear in NeurIPS'18

  16. arXiv:1711.04500  [pdf]

    cs.CY

    A Case Study of the 2016 Korean Cyber Command Compromise

    Authors: Kyong Jae Park, Sung Mi Park, Joshua I. James

    Abstract: On October 2016 the South Korean cyber military unit was the victim of a successful cyber attack that allowed access to internal networks. Per usual with large scale attacks against South Korean entities, the hack was immediately attributed to North Korea. Also, per other large-scale cyber security incidents, the same types of 'evidence' were used for attribution purposes. Disclosed methods of att…

    Submitted 13 November, 2017; originally announced November 2017.

    Journal ref: European Conference on Information Warfare and Security, ECCWS. p.315-321 (2017)