Showing 1–14 of 14 results for author: Lee, D B

  1. arXiv:2506.00910  [pdf, other]

    cs.LG cs.AI

    PCoreSet: Effective Active Learning through Knowledge Distillation from Vision-Language Models

    Authors: Seongjae Kang, Dong Bok Lee, Hyungjoon Jang, Dongseop Kim, Sung Ju Hwang

    Abstract: Knowledge distillation (KD) is a widely used framework for training compact, task-specific models by leveraging the knowledge of teacher models. However, its application to active learning (AL), which aims to minimize annotation costs through iterative sample selection, remains underexplored. This gap stems from the fact that KD typically assumes access to sufficient labeled data, whereas AL opera…

    Submitted 1 June, 2025; originally announced June 2025.

    Comments: 35 pages, 30 figures

  2. arXiv:2505.23032  [pdf, ps, other]

    cs.LG cs.AI

    Bayesian Neural Scaling Law Extrapolation with Prior-Data Fitted Networks

    Authors: Dongwoo Lee, Dong Bok Lee, Steven Adriaensen, Juho Lee, Sung Ju Hwang, Frank Hutter, Seon Joo Kim, Hae Beom Lee

    Abstract: Scaling has been a major driver of recent advancements in deep learning. Numerous empirical studies have found that scaling laws often follow the power-law and proposed several variants of power-law functions to predict the scaling behavior at larger scales. However, existing methods mostly rely on point estimation and do not quantify uncertainty, which is crucial for real-world applications invol…

    Submitted 15 June, 2025; v1 submitted 28 May, 2025; originally announced May 2025.

    Comments: Accepted to ICML 2025

  3. arXiv:2505.12805  [pdf, other]

    cs.LG cs.AI

    FedSVD: Adaptive Orthogonalization for Private Federated Learning with LoRA

    Authors: Seanie Lee, Sangwoo Park, Dong Bok Lee, Dominik Wagner, Haebin Seong, Tobias Bocklet, Juho Lee, Sung Ju Hwang

    Abstract: Low-Rank Adaptation (LoRA), which introduces a product of two trainable low-rank matrices into frozen pre-trained weights, is widely used for efficient fine-tuning of language models in federated learning (FL). However, when combined with differentially private stochastic gradient descent (DP-SGD), LoRA faces substantial noise amplification: DP-SGD perturbs per-sample gradients, and the matrix mul…

    Submitted 19 May, 2025; originally announced May 2025.

    Comments: preprint

  4. arXiv:2505.07675  [pdf, other]

    cs.LG cs.AI cs.CV

    Simple Semi-supervised Knowledge Distillation from Vision-Language Models via $\mathbf{\texttt{D}}$ual-$\mathbf{\texttt{H}}$ead $\mathbf{\texttt{O}}$ptimization

    Authors: Seongjae Kang, Dong Bok Lee, Hyungjoon Jang, Sung Ju Hwang

    Abstract: Vision-language models (VLMs) have achieved remarkable success across diverse tasks by leveraging rich textual information with minimal labeled data. However, deploying such large models remains challenging, particularly in resource-constrained environments. Knowledge distillation (KD) offers a well-established solution to this problem; however, recent KD approaches from VLMs often involve multi-s…

    Submitted 12 May, 2025; originally announced May 2025.

    Comments: 41 pages, 19 figures, preprint

  5. arXiv:2502.12464  [pdf, other]

    cs.CL

    SafeRoute: Adaptive Model Selection for Efficient and Accurate Safety Guardrails in Large Language Models

    Authors: Seanie Lee, Dong Bok Lee, Dominik Wagner, Minki Kang, Haebin Seong, Tobias Bocklet, Juho Lee, Sung Ju Hwang

    Abstract: Deploying large language models (LLMs) in real-world applications requires robust safety guard models to detect and block harmful user prompts. While large safety guard models achieve strong performance, their computational cost is substantial. To mitigate this, smaller distilled models are used, but they often underperform on "hard" examples where the larger model provides accurate predictions. W…

    Submitted 21 May, 2025; v1 submitted 17 February, 2025; originally announced February 2025.

    Comments: ACL 2025 findings

  6. arXiv:2410.01524  [pdf, other]

    cs.CL cs.LG

    HarmAug: Effective Data Augmentation for Knowledge Distillation of Safety Guard Models

    Authors: Seanie Lee, Haebin Seong, Dong Bok Lee, Minki Kang, Xiaoyin Chen, Dominik Wagner, Yoshua Bengio, Juho Lee, Sung Ju Hwang

    Abstract: Safety guard models that detect malicious queries aimed at large language models (LLMs) are essential for ensuring the secure and responsible deployment of LLMs in real-world applications. However, deploying existing safety guard models with billions of parameters alongside LLMs on mobile devices is impractical due to substantial memory requirements and latency. To reduce this cost, we distill a l…

    Submitted 24 February, 2025; v1 submitted 2 October, 2024; originally announced October 2024.

    Comments: ICLR 2025

  7. arXiv:2405.17918  [pdf, other]

    cs.LG cs.AI

    Cost-Sensitive Multi-Fidelity Bayesian Optimization with Transfer of Learning Curve Extrapolation

    Authors: Dong Bok Lee, Aoxuan Silvia Zhang, Byungjoo Kim, Junhyeon Park, Juho Lee, Sung Ju Hwang, Hae Beom Lee

    Abstract: In this paper, we address the problem of cost-sensitive multi-fidelity Bayesian Optimization (BO) for efficient hyperparameter optimization (HPO). Specifically, we assume a scenario where users want to early-stop the BO when the performance improvement is not satisfactory with respect to the required computational cost. Motivated by this scenario, we introduce utility, which is a function predefin…

    Submitted 28 May, 2024; originally announced May 2024.

  8. arXiv:2310.06511  [pdf, other]

    cs.LG

    Self-Supervised Dataset Distillation for Transfer Learning

    Authors: Dong Bok Lee, Seanie Lee, Joonho Ko, Kenji Kawaguchi, Juho Lee, Sung Ju Hwang

    Abstract: Dataset distillation methods have achieved remarkable success in distilling a large dataset into a small set of representative samples. However, they are not designed to produce a distilled dataset that can be effectively used for facilitating self-supervised pre-training. To this end, we propose a novel problem of distilling an unlabeled dataset into a set of small synthetic samples for efficient…

    Submitted 11 April, 2024; v1 submitted 10 October, 2023; originally announced October 2023.

  9. arXiv:2210.10485  [pdf, other]

    cs.LG

    Learning Transferable Adversarial Robust Representations via Multi-view Consistency

    Authors: Minseon Kim, Hyeonjeong Ha, Dong Bok Lee, Sung Ju Hwang

    Abstract: Despite the success on few-shot learning problems, most meta-learned models only focus on achieving good performance on clean examples and thus easily break down when given adversarially perturbed samples. While some recent works have shown that a combination of adversarial learning and meta-learning could enhance the robustness of a meta-learner against adversarial attacks, they fail to achieve g…

    Submitted 26 October, 2023; v1 submitted 19 October, 2022; originally announced October 2022.

    Comments: *Equal contribution (Author ordering determined by coin flip). NeurIPS SafetyML workshop 2022, Under review

  10. arXiv:2208.10494  [pdf, other]

    cs.LG cs.AI

    Dataset Condensation with Latent Space Knowledge Factorization and Sharing

    Authors: Hae Beom Lee, Dong Bok Lee, Sung Ju Hwang

    Abstract: In this paper, we introduce a novel approach for systematically solving dataset condensation problem in an efficient manner by exploiting the regularity in a given dataset. Instead of condensing the dataset directly in the original input space, we assume a generative process of the dataset with a set of learnable codes defined in a compact latent space followed by a set of tiny decoders which maps…

    Submitted 21 August, 2022; originally announced August 2022.

  11. arXiv:2106.03153  [pdf, other]

    eess.AS cs.CL cs.LG cs.SD

    Meta-StyleSpeech: Multi-Speaker Adaptive Text-to-Speech Generation

    Authors: Dongchan Min, Dong Bok Lee, Eunho Yang, Sung Ju Hwang

    Abstract: With rapid progress in neural text-to-speech (TTS) models, personalized speech generation is now in high demand for many applications. For practical applicability, a TTS model should generate high-quality speech with only a few audio samples from the given speaker, that are also short in length. However, existing methods either require to fine-tune the model or achieve low adaptation quality witho…

    Submitted 16 June, 2021; v1 submitted 6 June, 2021; originally announced June 2021.

    Comments: Accepted by ICML 2021

  12. arXiv:2012.07280  [pdf, other]

    cs.CL

    Contrastive Learning with Adversarial Perturbations for Conditional Text Generation

    Authors: Seanie Lee, Dong Bok Lee, Sung Ju Hwang

    Abstract: Recently, sequence-to-sequence (seq2seq) models with the Transformer architecture have achieved remarkable performance on various conditional text generation tasks, such as machine translation. However, most of them are trained with teacher forcing with the ground truth label given at each time step, without being exposed to incorrectly generated tokens during training, which hurts its generalizat…

    Submitted 10 March, 2021; v1 submitted 14 December, 2020; originally announced December 2020.

    Comments: ICLR 2021

  13. arXiv:2006.06648  [pdf, other]

    cs.LG stat.ML

    Learning to Extrapolate Knowledge: Transductive Few-shot Out-of-Graph Link Prediction

    Authors: Jinheon Baek, Dong Bok Lee, Sung Ju Hwang

    Abstract: Many practical graph problems, such as knowledge graph construction and drug-drug interaction prediction, require to handle multi-relational graphs. However, handling real-world multi-relational graphs with Graph Neural Networks (GNNs) is often challenging due to their evolving nature, as new entities (nodes) can emerge over time. Moreover, newly emerged entities often have few links, which makes…

    Submitted 29 October, 2020; v1 submitted 11 June, 2020; originally announced June 2020.

    Comments: NeurIPS 2020

  14. arXiv:2005.13837  [pdf, other]

    cs.CL cs.LG

    Generating Diverse and Consistent QA pairs from Contexts with Information-Maximizing Hierarchical Conditional VAEs

    Authors: Dong Bok Lee, Seanie Lee, Woo Tae Jeong, Donghwan Kim, Sung Ju Hwang

    Abstract: One of the most crucial challenges in question answering (QA) is the scarcity of labeled data, since it is costly to obtain question-answer (QA) pairs for a target text domain with human annotation. An alternative approach to tackle the problem is to use automatically generated QA pairs from either the problem context or from large amount of unstructured texts (e.g. Wikipedia). In this work, we pr…

    Submitted 14 June, 2020; v1 submitted 28 May, 2020; originally announced May 2020.

    Comments: ACL 2020