Showing 1–28 of 28 results for author: Cho, Y J

Searching in archive cs.
  1. arXiv:2411.15783  [pdf, other]

    cs.HC

    Limitations of Online Play Content for Parents of Infants and Toddlers

    Authors: Keunwoo Park, Subin Ahn, Mina Jung, You Jung Cho, Seulah Jeong, Cheong-Ah Huh

    Abstract: Play is a fundamental aspect of developmental growth, yet many parents encounter significant challenges in fulfilling their caregiving roles in this area. As online content increasingly serves as the primary source of parental guidance, this study investigates the difficulties parents face related to play and evaluates the limitations of current online content. We identified ten findings through i…

    Submitted 4 January, 2025; v1 submitted 24 November, 2024; originally announced November 2024.

    Comments: Accepted to HCI Korea 2025

  2. arXiv:2406.01506  [pdf, other]

    cs.CL cs.AI cs.LG stat.ML

    The Geometry of Categorical and Hierarchical Concepts in Large Language Models

    Authors: Kiho Park, Yo Joong Choe, Yibo Jiang, Victor Veitch

    Abstract: The linear representation hypothesis is the informal idea that semantic concepts are encoded as linear directions in the representation spaces of large language models (LLMs). Previous work has shown how to make this notion precise for representing binary concepts that have natural contrasts (e.g., {male, female}) as directions in representation space. However, many natural concepts do not have na…

    Submitted 17 February, 2025; v1 submitted 3 June, 2024; originally announced June 2024.

    Comments: Accepted for an oral presentation at ICLR 2025. Best Paper Award at the ICML 2024 Workshop on Mechanistic Interpretability. Code is available at https://github.com/KihoPark/LLM_Categorical_Hierarchical_Representations

  3. arXiv:2402.09698  [pdf, other]

    stat.ME cs.LG math.PR math.ST stat.ML

    Combining Evidence Across Filtrations

    Authors: Yo Joong Choe, Aaditya Ramdas

    Abstract: In sequential anytime-valid inference, any admissible procedure must be based on e-processes: generalizations of test martingales that quantify the accumulated evidence against a composite null hypothesis at any stopping time. This paper proposes a method for combining e-processes constructed in different filtrations but for the same null. Although e-processes in the same filtration can be combine…

    Submitted 15 February, 2025; v1 submitted 14 February, 2024; originally announced February 2024.

    Comments: Under review. Previous title was "Combining Evidence Across Filtrations Using Adjusters". Code is available at https://github.com/yjchoe/CombiningEvidenceAcrossFiltrations

  4. arXiv:2401.06432  [pdf, other]

    cs.LG cs.DC

    Heterogeneous LoRA for Federated Fine-tuning of On-Device Foundation Models

    Authors: Yae Jee Cho, Luyang Liu, Zheng Xu, Aldi Fahrezi, Gauri Joshi

    Abstract: Foundation models (FMs) adapt well to specific domains or tasks with fine-tuning, and federated learning (FL) enables the potential for privacy-preserving fine-tuning of the FMs with on-device local data. For federated fine-tuning of FMs, we consider FMs with small to medium parameter sizes, at most single-digit billions, referred to as on-device FMs (ODFMs) that can be deployed on devices…

    Submitted 20 February, 2024; v1 submitted 12 January, 2024; originally announced January 2024.

  5. arXiv:2311.03658  [pdf, other]

    cs.CL cs.AI cs.LG stat.ML

    The Linear Representation Hypothesis and the Geometry of Large Language Models

    Authors: Kiho Park, Yo Joong Choe, Victor Veitch

    Abstract: Informally, the 'linear representation hypothesis' is the idea that high-level concepts are represented linearly as directions in some representation space. In this paper, we address two closely related questions: What does "linear representation" actually mean? And, how do we make sense of geometric notions (e.g., cosine similarity or projection) in the representation space? To answer these, we u…

    Submitted 17 July, 2024; v1 submitted 6 November, 2023; originally announced November 2023.

    Comments: Accepted for a presentation at ICML 2024 and an oral presentation at NeurIPS 2023 Workshop on Causal Representation Learning. Code is available at https://github.com/KihoPark/linear_rep_geometry

  6. arXiv:2307.08809  [pdf, other]

    cs.LG cs.AI cs.CV

    Local or Global: Selective Knowledge Assimilation for Federated Learning with Limited Labels

    Authors: Yae Jee Cho, Gauri Joshi, Dimitrios Dimitriadis

    Abstract: Many existing FL methods assume clients with fully-labeled data, while in realistic settings, clients have limited labels due to the expensive and laborious process of labeling. Limited labeled local data of the clients often leads to their local model having poor generalization abilities to their larger unlabeled local data, such as having class-distribution mismatch with the unlabeled data. As a…

    Submitted 17 July, 2023; originally announced July 2023.

    Comments: To appear in the proceedings of ICCV 2023

  7. arXiv:2305.10564  [pdf, other]

    stat.ML cs.AI cs.LG stat.ME

    Counterfactually Comparing Abstaining Classifiers

    Authors: Yo Joong Choe, Aditya Gangrade, Aaditya Ramdas

    Abstract: Abstaining classifiers have the option to abstain from making predictions on inputs that they are unsure about. These classifiers are becoming increasingly popular in high-stakes decision-making problems, as they can withhold uncertain predictions to improve their reliability and safety. When evaluating black-box abstaining classifier(s), however, we lack a principled approach that accounts for wh…

    Submitted 9 November, 2023; v1 submitted 17 May, 2023; originally announced May 2023.

    Comments: Accepted to NeurIPS 2023. Preliminary work presented at the ICML 2023 Workshop on Counterfactuals in Minds and Machines. Code available at https://github.com/yjchoe/ComparingAbstainingClassifiers

  8. arXiv:2302.03109  [pdf, other]

    cs.LG cs.DC

    On the Convergence of Federated Averaging with Cyclic Client Participation

    Authors: Yae Jee Cho, Pranay Sharma, Gauri Joshi, Zheng Xu, Satyen Kale, Tong Zhang

    Abstract: Federated Averaging (FedAvg) and its variants are the most popular optimization algorithms in federated learning (FL). Previous convergence analyses of FedAvg either assume full client participation or partial client participation where the clients can be uniformly sampled. However, in practical cross-device FL systems, only a subset of clients that satisfy local criteria such as battery status, n…

    Submitted 6 February, 2023; originally announced February 2023.

  9. arXiv:2205.14840  [pdf, other]

    cs.LG

    Maximizing Global Model Appeal in Federated Learning

    Authors: Yae Jee Cho, Divyansh Jhunjhunwala, Tian Li, Virginia Smith, Gauri Joshi

    Abstract: Federated learning typically considers collaboratively training a global model using local data at edge clients. Clients may have their own individual requirements, such as having a minimal training loss threshold, which they expect to be met by the global model. However, due to client heterogeneity, the global model may not meet each client's requirements, and only a small subset may find the glo…

    Submitted 4 February, 2023; v1 submitted 30 May, 2022; originally announced May 2022.

  10. arXiv:2204.12703  [pdf, other]

    cs.LG

    Heterogeneous Ensemble Knowledge Transfer for Training Large Models in Federated Learning

    Authors: Yae Jee Cho, Andre Manoel, Gauri Joshi, Robert Sim, Dimitrios Dimitriadis

    Abstract: Federated learning (FL) enables edge-devices to collaboratively learn a model without disclosing their private data to a central aggregating server. Most existing FL algorithms require models of identical architecture to be deployed across the clients and server, making it infeasible to train large models due to clients' limited system resources. In this work, we propose a novel ensemble knowledge…

    Submitted 27 April, 2022; originally announced April 2022.

    Comments: To appear in the proceedings of the 31st International Joint Conference on Artificial Intelligence (IJCAI 2022)

  11. arXiv:2110.00115  [pdf, other]

    stat.ME cs.LG math.ST stat.AP stat.ML

    Comparing Sequential Forecasters

    Authors: Yo Joong Choe, Aaditya Ramdas

    Abstract: Consider two forecasters, each making a single prediction for a sequence of events over time. We ask a relatively basic question: how might we compare these forecasters, either online or post-hoc, while avoiding unverifiable assumptions on how the forecasts and outcomes were generated? In this paper, we present a rigorous answer to this question by designing novel sequential inference procedures f…

    Submitted 9 November, 2023; v1 submitted 30 September, 2021; originally announced October 2021.

    Comments: Published in Operations Research. Code and data sources available at https://github.com/yjchoe/ComparingForecasters

  12. arXiv:2109.08119  [pdf, other]

    cs.LG

    Personalized Federated Learning for Heterogeneous Clients with Clustered Knowledge Transfer

    Authors: Yae Jee Cho, Jianyu Wang, Tarun Chiruvolu, Gauri Joshi

    Abstract: Personalized federated learning (FL) aims to train model(s) that can perform well for individual clients that are highly data and system heterogeneous. Most work in personalized FL, however, assumes using the same model architecture at all clients and increases the communication cost by sending/receiving models. This may not be feasible for realistic scenarios of FL. In practice, clients have high…

    Submitted 16 September, 2021; originally announced September 2021.

  13. arXiv:2012.08009  [pdf, other]

    cs.LG cs.AI

    Bandit-based Communication-Efficient Client Selection Strategies for Federated Learning

    Authors: Yae Jee Cho, Samarth Gupta, Gauri Joshi, Osman Yağan

    Abstract: Due to communication constraints and intermittent client availability in federated learning, only a subset of clients can participate in each training round. While most prior works assume uniform and unbiased client selection, recent work on biased client selection has shown that selecting clients with higher local losses can improve error convergence speed. However, previously proposed biased sel…

    Submitted 14 December, 2020; originally announced December 2020.

  14. arXiv:2010.01243  [pdf, other]

    cs.LG cs.DC stat.ML

    Client Selection in Federated Learning: Convergence Analysis and Power-of-Choice Selection Strategies

    Authors: Yae Jee Cho, Jianyu Wang, Gauri Joshi

    Abstract: Federated learning is a distributed optimization paradigm that enables a large number of resource-limited client nodes to cooperatively train a model without data sharing. Several works have analyzed the convergence of federated learning by accounting for data heterogeneity, communication and computation limitations, and partial client participation. However, they assume unbiased client participati…

    Submitted 2 October, 2020; originally announced October 2020.

  15. arXiv:2004.05007  [pdf, other]

    stat.ML cs.LG

    An Empirical Study of Invariant Risk Minimization

    Authors: Yo Joong Choe, Jiyeon Ham, Kyubyong Park

    Abstract: Invariant risk minimization (IRM) (Arjovsky et al., 2019) is a recently proposed framework designed for learning predictors that are invariant to spurious correlations across different training environments. Yet, despite its theoretical justifications, IRM has not been extensively tested across various settings. In an attempt to gain a better understanding of the framework, we empirically investig…

    Submitted 6 July, 2020; v1 submitted 10 April, 2020; originally announced April 2020.

    Comments: Presented at the ICML 2020 Workshop on Uncertainty and Robustness in Deep Learning. Code at https://github.com/kakaobrain/irm-empirical-study

  16. arXiv:2004.03289  [pdf, other]

    cs.CL

    KorNLI and KorSTS: New Benchmark Datasets for Korean Natural Language Understanding

    Authors: Jiyeon Ham, Yo Joong Choe, Kyubyong Park, Ilji Choi, Hyungjoon Soh

    Abstract: Natural language inference (NLI) and semantic textual similarity (STS) are key tasks in natural language understanding (NLU). Although several benchmark datasets for those tasks have been released in English and a few other languages, there are no publicly available NLI or STS datasets in the Korean language. Motivated by this, we construct and release new datasets for Korean NLI and STS, dubbed K…

    Submitted 5 October, 2020; v1 submitted 7 April, 2020; originally announced April 2020.

    Comments: Findings of EMNLP 2020. Datasets available at https://github.com/kakaobrain/KorNLUDatasets

  17. arXiv:1911.12071  [pdf, other]

    cs.CL

    Jejueo Datasets for Machine Translation and Speech Synthesis

    Authors: Kyubyong Park, Yo Joong Choe, Jiyeon Ham

    Abstract: Jejueo was classified as critically endangered by UNESCO in 2010. Although diverse efforts to revitalize it have been made, there have been few computational approaches. Motivated by this, we construct two new Jejueo datasets: Jejueo Interview Transcripts (JIT) and Jejueo Single Speaker Speech (JSS). The JIT dataset is a parallel corpus containing 170k+ Jejueo-Korean sentences, and the JSS dataset…

    Submitted 27 November, 2019; originally announced November 2019.

  18. arXiv:1911.12019  [pdf, other]

    cs.CL

    word2word: A Collection of Bilingual Lexicons for 3,564 Language Pairs

    Authors: Yo Joong Choe, Kyubyong Park, Dongwoo Kim

    Abstract: We present word2word, a publicly available dataset and an open-source Python package for cross-lingual word translations extracted from sentence-level parallel corpora. Our dataset provides top-k word translations in 3,564 (directed) language pairs across 62 languages in OpenSubtitles2018 (Lison et al., 2018). To obtain this dataset, we use a count-based bilingual lexicon extraction model based on…

    Submitted 27 November, 2019; originally announced November 2019.

  19. arXiv:1907.01256  [pdf, other]

    cs.CL cs.LG

    A Neural Grammatical Error Correction System Built On Better Pre-training and Sequential Transfer Learning

    Authors: Yo Joong Choe, Jiyeon Ham, Kyubyong Park, Yeoil Yoon

    Abstract: Grammatical error correction can be viewed as a low-resource sequence-to-sequence task, because publicly available parallel corpora are limited. To tackle this challenge, we first generate erroneous versions of large unannotated corpora using a realistic noising function. The resulting parallel corpora are subsequently used to pre-train Transformer models. Then, by sequentially applying transfer l…

    Submitted 2 July, 2019; originally announced July 2019.

    Comments: Accepted to ACL 2019 Workshop on Innovative Use of NLP for Building Educational Applications (BEA)

  20. arXiv:1904.08144  [pdf, other]

    cs.LG stat.ML

    Predicting drug-target interaction using 3D structure-embedded graph representations from graph neural networks

    Authors: Jaechang Lim, Seongok Ryu, Kyubyong Park, Yo Joong Choe, Jiyeon Ham, Woo Youn Kim

    Abstract: Accurate prediction of drug-target interaction (DTI) is essential for in silico drug design. For this purpose, we propose a novel approach for predicting DTI using a GNN that directly incorporates the 3D structure of a protein-ligand complex. We also apply a distance-aware graph attention algorithm with gate augmentation to increase the performance of our model. As a result, our model shows better…

    Submitted 17 April, 2019; originally announced April 2019.

    Comments: 20 pages, 2 figures

  21. arXiv:1902.07249  [pdf, other]

    cs.CL cs.LG stat.ML

    Discovery of Natural Language Concepts in Individual Units of CNNs

    Authors: Seil Na, Yo Joong Choe, Dong-Hyun Lee, Gunhee Kim

    Abstract: Although deep convolutional networks have achieved improved performance in many natural language tasks, they have been treated as black boxes because they are difficult to interpret. In particular, little is known about how they represent language in their intermediate layers. In an attempt to understand the representations of deep convolutional networks trained on language tasks, we show that indivi…

    Submitted 28 February, 2019; v1 submitted 18 February, 2019; originally announced February 2019.

    Comments: Published as a conference paper at ICLR 2019

  22. arXiv:1805.09252  [pdf, other]

    cs.NI

    V2X Downlink Coverage Analysis with a Realistic Urban Vehicular Model

    Authors: Yae Jee Cho, Kaibin Huang, Chan-Byoung Chae

    Abstract: As the realization of vehicular communication such as vehicle-to-vehicle (V2V) or vehicle-to-infrastructure (V2I) is imperative for autonomous driving cars, an understanding of realistic vehicle-to-everything (V2X) models is needed. While previous research has mostly targeted vehicular models in which vehicles are randomly distributed and the variable of carrier frequency was not considered,…

    Submitted 25 June, 2018; v1 submitted 10 May, 2018; originally announced May 2018.

  23. arXiv:1801.07167  [pdf, ps, other]

    eess.SP cs.IT

    RF Lens-Embedded Antenna Array for mmWave MIMO: Design and Performance

    Authors: Yae Jee Cho, Gee-Yong Suk, Byoungnam Kim, Dong Ku Kim, Chan-Byoung Chae

    Abstract: The requirement of high data-rate in the fifth generation wireless systems (5G) calls for the ultimate utilization of the wide bandwidth in the mmWave frequency band. Researchers seeking to compensate for mmWave's high path loss and to achieve both gain and directivity have proposed that mmWave multiple-input multiple-output (MIMO) systems make use of beamforming systems. Hybrid beamforming in mmW…

    Submitted 22 January, 2018; originally announced January 2018.

  24. arXiv:1711.09052  [pdf, other]

    cs.IT eess.SP

    Map-based Millimeter-Wave Channel Models: An Overview, Hybrid Modeling, Data, and Learning

    Authors: Yeon-Geun Lim, Yae Jee Cho, MinSoo Sim, Younsun Kim, Chan-Byoung Chae, Reinaldo A. Valenzuela

    Abstract: Compared to the current wireless communication systems, millimeter wave (mm-Wave) promises a wide range of spectrum. As viable alternatives to existing mm-Wave channel models, various map-based channel models with different modeling methods have been widely discussed. Map-based channel models are based on a ray-tracing algorithm and include realistic channel parameters in a given map. Such paramet…

    Submitted 10 July, 2019; v1 submitted 24 November, 2017; originally announced November 2017.

  25. arXiv:1707.00227  [pdf, other]

    cs.IT

    Relationship between Cross-Polarization Discrimination (XPD) and Spatial Correlation in Indoor Small-Cell MIMO Systems

    Authors: Yeon-Geun Lim, Yae Jee Cho, TaeckKeun Oh, Yongshik Lee, Chan-Byoung Chae

    Abstract: In this letter, we present a correlated channel model for a dual-polarization antenna to omnidirectional antennas in indoor small-cell multiple-input multiple-output (MIMO) systems. In an indoor environment, we confirm that the cross-polarization discrimination (XPD) in the direction of angle-of-departure can be represented as the spatial correlation of the MIMO channel. We also evaluate a dual-po…

    Submitted 6 December, 2017; v1 submitted 1 July, 2017; originally announced July 2017.

  26. arXiv:1703.06384  [pdf, other]

    cs.ET

    Effective Enzyme Deployment for Degradation of Interference Molecules in Molecular Communication

    Authors: Yae Jee Cho, H. Birkan Yilmaz, Weisi Guo, Chan-Byoung Chae

    Abstract: In molecular communication, the heavy-tailed nature of molecular signals causes inter-symbol interference (ISI). Because of this, it is difficult to decrease symbol periods and achieve high data rates. As a possible solution for ISI mitigation, enzymes have been proposed, since they are capable of degrading ISI molecules without deteriorating the molecular communication. While most prior work ha…

    Submitted 18 March, 2017; originally announced March 2017.

  27. arXiv:1611.06079  [pdf, other]

    cs.ET cs.IT

    A Machine Learning Approach to Model the Received Signal in Molecular Communications

    Authors: H. Birkan Yilmaz, Changmin Lee, Yae Jee Cho, Chan-Byoung Chae

    Abstract: A molecular communication channel is determined by the received signal. Received signal models form the basis for studies focused on modulation, receiver design, capacity, and coding. Therefore, it is crucial to model the number of received molecules until time $t$ analytically. Modeling the diffusion-based molecular communication channel with the first-hitting…

    Submitted 18 November, 2016; originally announced November 2016.

  28. arXiv:1604.05018  [pdf, other]

    cs.ET

    Effective inter-symbol interference mitigation with a limited amount of enzymes in molecular communications

    Authors: Yae Jee Cho, H. Birkan Yilmaz, Weisi Guo, Chan-Byoung Chae

    Abstract: In molecular communication via diffusion (MCvD), inter-symbol interference (ISI) is a well-known, severe problem that deteriorates both data rates and link reliability. ISI mainly occurs due to the slow and highly random propagation of the messenger molecules, which causes the emitted molecules from the previous symbols to interfere with molecules from the current symbol. An effective way to mi…

    Submitted 18 April, 2016; originally announced April 2016.