
Showing 1–27 of 27 results for author: Bak, Y

Searching in archive cs.
  1. Explainable AI-Based Interface System for Weather Forecasting Model

    Authors: Soyeon Kim, Junho Choi, Yeji Choi, Subeen Lee, Artyom Stitsyuk, Minkyoung Park, Seongyeop Jeong, Youhyun Baek, Jaesik Choi

    Abstract: Machine learning (ML) is becoming increasingly popular in meteorological decision-making. Although the literature on explainable artificial intelligence (XAI) is growing steadily, user-centered XAI studies have not yet extended to this domain. This study defines three requirements for explanations of black-box models in meteorology through user studies: statistical model performance for different ra…

    Submitted 1 April, 2025; originally announced April 2025.

    Comments: 19 pages, 16 figures

    MSC Class: 68T07; ACM Class: I.2.1

    Journal ref: HCI International 2023. Lecture Notes in Computer Science, vol 14059. Springer, Cham

  2. arXiv:2502.18934  [pdf, other]

    cs.CL cs.LG

    Kanana: Compute-efficient Bilingual Language Models

    Authors: Kanana LLM Team, Yunju Bak, Hojin Lee, Minho Ryu, Jiyeon Ham, Seungjae Jung, Daniel Wontae Nam, Taegyeong Eo, Donghun Lee, Doohae Jung, Boseop Kim, Nayeon Kim, Jaesun Park, Hyunho Kim, Hyunwoong Ko, Changmin Lee, Kyoung-Woon On, Seulye Baeg, Junrae Cho, Sunghee Jung, Jieun Kang, EungGyun Kim, Eunhwa Kim, Byeongil Ko, Daniel Lee, et al. (4 additional authors not shown)

    Abstract: We introduce Kanana, a series of bilingual language models that demonstrate exceeding performance in Korean and competitive performance in English. The computational cost of Kanana is significantly lower than that of state-of-the-art models of similar size. The report details the techniques employed during pre-training to achieve compute-efficient yet competitive models, including high quality dat…

    Submitted 28 February, 2025; v1 submitted 26 February, 2025; originally announced February 2025.

    Comments: 40 pages, 15 figures

  3. arXiv:2411.04351  [pdf, other]

    cs.CV

    LidaRefer: Outdoor 3D Visual Grounding for Autonomous Driving with Transformers

    Authors: Yeong-Seung Baek, Heung-Seon Oh

    Abstract: 3D visual grounding (VG) aims to locate relevant objects or regions within 3D scenes based on natural language descriptions. Although recent methods for indoor 3D VG have successfully used transformer-based architectures to capture global contextual information and enable fine-grained cross-modal fusion, they are unsuitable for outdoor environments due to differences in the distribution of point clouds…

    Submitted 6 November, 2024; originally announced November 2024.

    Comments: 16 pages, 5 figures

  4. arXiv:2410.16345  [pdf, other]

    cs.LG physics.data-an

    Exploring how deep learning decodes anomalous diffusion via Grad-CAM

    Authors: Jaeyong Bae, Yongjoo Baek, Hawoong Jeong

    Abstract: While deep learning has been successfully applied to the data-driven classification of anomalous diffusion mechanisms, how the algorithm achieves the feat still remains a mystery. In this study, we use a well-known technique aimed at achieving explainable AI, namely the Gradient-weighted Class Activation Map (Grad-CAM), to investigate how deep learning (implemented by ResNets) recognizes the disti…

    Submitted 21 October, 2024; originally announced October 2024.

    Comments: 14 pages, 12 figures

  5. arXiv:2410.03181  [pdf, other]

    cs.CL

    Kiss up, Kick down: Exploring Behavioral Changes in Multi-modal Large Language Models with Assigned Visual Personas

    Authors: Seungjong Sun, Eungu Lee, Seo Yeon Baek, Seunghyun Hwang, Wonbyung Lee, Dongyan Nan, Bernard J. Jansen, Jang Hyun Kim

    Abstract: This study is the first to explore whether multi-modal large language models (LLMs) can align their behaviors with visual personas, addressing a significant gap in the literature that predominantly focuses on text-based personas. We developed a novel dataset of 5K fictional avatar images for assignment as visual personas to LLMs, and analyzed their negotiation behaviors based on the visual traits…

    Submitted 4 October, 2024; originally announced October 2024.

    Comments: EMNLP 2024

  6. arXiv:2406.16469  [pdf, other]

    cs.CL cs.CV

    Evaluating Visual and Cultural Interpretation: The K-Viscuit Benchmark with Human-VLM Collaboration

    Authors: Yujin Baek, ChaeHun Park, Jaeseok Kim, Yu-Jung Heo, Du-Seong Chang, Jaegul Choo

    Abstract: To create culturally inclusive vision-language models (VLMs), developing a benchmark that tests their ability to address culturally relevant questions is essential. Existing approaches typically rely on human annotators, making the process labor-intensive and creating a cognitive burden in generating diverse questions. To address this, we propose a semi-automated framework for constructing cultura…

    Submitted 17 December, 2024; v1 submitted 24 June, 2024; originally announced June 2024.

  7. arXiv:2405.00260  [pdf, other]

    cs.CV

    CREPE: Coordinate-Aware End-to-End Document Parser

    Authors: Yamato Okamoto, Youngmin Baek, Geewook Kim, Ryota Nakao, DongHyun Kim, Moon Bin Yim, Seunghyun Park, Bado Lee

    Abstract: In this study, we formulate an OCR-free sequence generation model for visual document understanding (VDU). Our model not only parses text from document images but also extracts the spatial coordinates of the text based on the multi-head architecture. Named Coordinate-aware End-to-end Document Parser (CREPE), our method uniquely integrates these capabilities by introducing a special token for OC…

    Submitted 30 April, 2024; originally announced May 2024.

    Comments: Accepted at the International Conference on Document Analysis and Recognition (ICDAR 2024) main conference

  8. arXiv:2404.19427  [pdf]

    cs.CV

    InstantFamily: Masked Attention for Zero-shot Multi-ID Image Generation

    Authors: Chanran Kim, Jeongin Lee, Shichang Joung, Bongmo Kim, Yeul-Min Baek

    Abstract: In the field of personalized image generation, the ability to create images preserving concepts has significantly improved. Creating an image that naturally integrates multiple concepts in a cohesive and visually appealing composition can indeed be challenging. This paper introduces "InstantFamily," an approach that employs a novel masked cross-attention mechanism and a multimodal embedding stack…

    Submitted 30 April, 2024; originally announced April 2024.

  9. arXiv:2404.07857  [pdf, other]

    physics.optics cs.ET nlin.CD

    Optical next generation reservoir computing

    Authors: Hao Wang, Jianqi Hu, YoonSeok Baek, Kohei Tsuchiyama, Malo Joly, Qiang Liu, Sylvain Gigan

    Abstract: Artificial neural networks with internal dynamics exhibit remarkable capability in processing information. Reservoir computing (RC) is a canonical example that features rich computing expressivity and compatibility with physical implementations for enhanced efficiency. Recently, a new RC paradigm known as next generation reservoir computing (NGRC) further improves expressivity but compromises its…

    Submitted 23 October, 2024; v1 submitted 11 April, 2024; originally announced April 2024.

  10. arXiv:2404.05622  [pdf, other]

    cs.CL cs.LG stat.ME

    How to Evaluate Entity Resolution Systems: An Entity-Centric Framework with Application to Inventor Name Disambiguation

    Authors: Olivier Binette, Youngsoo Baek, Siddharth Engineer, Christina Jones, Abel Dasylva, Jerome P. Reiter

    Abstract: Entity resolution (record linkage, microclustering) systems are notoriously difficult to evaluate. Looking for a needle in a haystack, traditional evaluation methods use sophisticated, application-specific sampling schemes to find matching pairs of records among an immense number of non-matches. We propose an alternative that facilitates the creation of representative, reusable benchmark data sets…

    Submitted 8 April, 2024; originally announced April 2024.

    Comments: 33 pages, 11 figures

  11. arXiv:2306.12089  [pdf, other]

    cs.CL

    Towards Accurate Translation via Semantically Appropriate Application of Lexical Constraints

    Authors: Yujin Baek, Koanho Lee, Dayeon Ki, Hyoung-Gyu Lee, Cheonbok Park, Jaegul Choo

    Abstract: Lexically-constrained NMT (LNMT) aims to incorporate user-provided terminology into translations. Despite its practical advantages, existing work has not evaluated LNMT models under challenging real-world conditions. In this paper, we focus on two important but under-studied issues that lie in the current evaluation process of LNMT studies. The model needs to cope with challenging lexical constrai…

    Submitted 21 June, 2023; originally announced June 2023.

    Comments: Findings of ACL2023. 15 pages

  12. arXiv:2306.03783  [pdf, other]

    stat.ML cs.LG

    Asymptotics of Bayesian Uncertainty Estimation in Random Features Regression

    Authors: Youngsoo Baek, Samuel I. Berchuck, Sayan Mukherjee

    Abstract: In this paper we compare and contrast the behavior of the posterior predictive distribution to the risk of the maximum a posteriori estimator for the random features regression model in the overparameterized regime. We will focus on the variance of the posterior predictive distribution (Bayesian model average) and compare its asymptotics to that of the risk of the MAP estimator. In the regime wher…

    Submitted 26 October, 2023; v1 submitted 6 June, 2023; originally announced June 2023.

    Comments: 14 pages, 3 figures

  13. arXiv:2305.00630  [pdf, other]

    cs.CV

    TRACE: Table Reconstruction Aligned to Corner and Edges

    Authors: Youngmin Baek, Daehyun Nam, Jaeheung Surh, Seung Shin, Seonghyeon Kim

    Abstract: A table is an object that captures structured and informative content within a document, and recognizing a table in an image is challenging due to the complexity and variety of table layouts. Many previous works typically adopt a two-stage approach: (1) Table Detection (TD) localizes the table region in an image and (2) Table Structure Recognition (TSR) identifies row- and column-wise adjacency rela…

    Submitted 30 April, 2023; originally announced May 2023.

    Comments: 18 pages, 7 figures, Accepted by ICDAR 2023

  14. arXiv:2303.05718  [pdf, other]

    cond-mat.stat-mech cs.LG

    Tradeoff of generalization error in unsupervised learning

    Authors: Gilhan Kim, Hojun Lee, Junghyo Jo, Yongjoo Baek

    Abstract: Finding the optimal model complexity that minimizes the generalization error (GE) is a key issue of machine learning. For conventional supervised learning, this task typically involves the bias-variance tradeoff: lowering the bias by making the model more complex entails an increase in the variance. Meanwhile, little has been studied about whether the same tradeoff exists for unsupervised lear…

    Submitted 12 September, 2023; v1 submitted 10 March, 2023; originally announced March 2023.

    Comments: 15 pages, 7 figures

    Journal ref: J. Stat. Mech.: Theor. Exp. 2023, 083401 (2023)

  15. arXiv:2303.02901  [pdf, other]

    cond-mat.stat-mech cs.LG stat.ML

    $α$-divergence Improves the Entropy Production Estimation via Machine Learning

    Authors: Euijoon Kwon, Yongjoo Baek

    Abstract: Recent years have seen a surge of interest in the algorithmic estimation of stochastic entropy production (EP) from trajectory data via machine learning. A crucial element of such algorithms is the identification of a loss function whose minimization guarantees accurate EP estimation. In this study, we show that there exists a host of loss functions, namely those implementing a variational rep…

    Submitted 19 January, 2024; v1 submitted 6 March, 2023; originally announced March 2023.

    Comments: 11 pages, 9 figures

  16. arXiv:2210.08103  [pdf, other]

    eess.SP cs.AI

    High-resolution synthetic residential energy use profiles for the United States

    Authors: Swapna Thorve, Young Yun Baek, Samarth Swarup, Henning Mortveit, Achla Marathe, Anil Vullikanti, Madhav Marathe

    Abstract: Efficient energy consumption is crucial for achieving sustainable energy goals in the era of climate change and grid modernization. Thus, it is vital to understand how energy is consumed at finer resolutions such as household in order to plan demand-response events or analyze the impacts of weather, electricity prices, electric vehicles, solar, and occupancy schedules on energy consumption. Howeve…

    Submitted 15 December, 2022; v1 submitted 14 October, 2022; originally announced October 2022.

    Comments: The paper has been accepted for publication in Nature Scientific Data

  17. arXiv:2210.01230  [pdf, other]

    cs.DL cs.DB cs.LG stat.ME

    Estimating the Performance of Entity Resolution Algorithms: Lessons Learned Through PatentsView.org

    Authors: Olivier Binette, Sokhna A York, Emma Hickerson, Youngsoo Baek, Sarvo Madhavan, Christina Jones

    Abstract: This paper introduces a novel evaluation methodology for entity resolution algorithms. It is motivated by PatentsView.org, a U.S. Patent and Trademark Office patent data exploration tool that disambiguates patent inventors using an entity resolution algorithm. We provide a data collection methodology and tailored performance estimators that account for sampling biases. Our approach is simple, pr…

    Submitted 17 April, 2023; v1 submitted 3 October, 2022; originally announced October 2022.

    Comments: 20 pages, 4 figures

    Journal ref: The American Statistician (2023)

  18. arXiv:2208.07522  [pdf, other]

    cs.LG cs.CY cs.SI

    Reliable Decision from Multiple Subtasks through Threshold Optimization: Content Moderation in the Wild

    Authors: Donghyun Son, Byounggyu Lew, Kwanghee Choi, Yongsu Baek, Seungwoo Choi, Beomjun Shin, Sungjoo Ha, Buru Chang

    Abstract: Social media platforms struggle to protect users from harmful content through content moderation. These platforms have recently leveraged machine learning models to cope with the vast amount of user-generated content daily. Since moderation policies vary depending on countries and types of products, it is common to train and deploy the models per policy. However, this approach is highly inefficien…

    Submitted 25 January, 2023; v1 submitted 15 August, 2022; originally announced August 2022.

    Comments: WSDM2023 (Oral Presentation)

  19. arXiv:2205.08290  [pdf, other]

    cs.SE

    Literature Review to Collect Conceptual Variables of Scenario Methods for Establishing a Conceptual Scenario Framework

    Authors: Young-Min Baek, Esther Cho, Donghwan Shin, Doo-Hwan Bae

    Abstract: Over recent decades, scenarios and scenario-based software/system engineering have been actively employed as essential tools to handle intricate problems, validate requirements, and support stakeholders' communication. However, despite the widespread use of scenarios, there have been several challenges for engineers to more willingly utilize scenario-based engineering approaches (i.e., scenario me…

    Submitted 17 May, 2022; originally announced May 2022.

    Comments: 22 pages, 7 figures

    MSC Class: 68M99; ACM Class: D.2.1

  20. arXiv:2203.05122  [pdf, other]

    cs.CV

    DEER: Detection-agnostic End-to-End Recognizer for Scene Text Spotting

    Authors: Seonghyeon Kim, Seung Shin, Yoonsik Kim, Han-Cheol Cho, Taeho Kil, Jaeheung Surh, Seunghyun Park, Bado Lee, Youngmin Baek

    Abstract: Recent end-to-end scene text spotters have achieved great improvement in recognizing arbitrary-shaped text instances. Common approaches for text spotting use region of interest pooling or segmentation masks to restrict features to single text instances. However, this makes it hard for the recognizer to decode correct sequences when the detection is not accurate, i.e., one or more characters are crop…

    Submitted 9 March, 2022; originally announced March 2022.

  21. arXiv:2007.09629  [pdf, other]

    cs.CV

    Character Region Attention For Text Spotting

    Authors: Youngmin Baek, Seung Shin, Jeonghun Baek, Sungrae Park, Junyeop Lee, Daehyun Nam, Hwalsuk Lee

    Abstract: A scene text spotter is composed of text detection and recognition modules. Many studies have been conducted to unify these modules into an end-to-end trainable model to achieve better performance. A typical architecture places detection and recognition modules into separate branches, and a RoI pooling is commonly used to let the branches share a visual feature. However, there still exists a chanc…

    Submitted 19 July, 2020; originally announced July 2020.

    Comments: 17 pages, 9 figures, Accepted by ECCV 2020

  22. arXiv:2006.06244  [pdf, other]

    cs.CV

    CLEval: Character-Level Evaluation for Text Detection and Recognition Tasks

    Authors: Youngmin Baek, Daehyun Nam, Sungrae Park, Junyeop Lee, Seung Shin, Jeonghun Baek, Chae Young Lee, Hwalsuk Lee

    Abstract: Despite the recent success of text detection and recognition methods, existing evaluation metrics fail to provide a fair and reliable comparison among those methods. In addition, there exists no end-to-end evaluation metric that takes characteristics of OCR tasks into account. Previous end-to-end metrics contain cascaded errors from the binary scoring process applied in both detection and recognit…

    Submitted 11 June, 2020; originally announced June 2020.

    Comments: 12 pages, 8 figures

  23. arXiv:1907.01227  [pdf, other]

    cs.CV

    TedEval: A Fair Evaluation Metric for Scene Text Detectors

    Authors: Chae Young Lee, Youngmin Baek, Hwalsuk Lee

    Abstract: Despite the recent success of scene text detection methods, common evaluation metrics fail to provide a fair and reliable comparison among detectors. They have obvious drawbacks in reflecting the inherent characteristic of text detection tasks, unable to address issues such as granularity, multiline, and character incompleteness. In this paper, we propose a novel evaluation protocol called TedEval…

    Submitted 2 July, 2019; originally announced July 2019.

    Comments: 7 pages, 10 figures, Accepted by Workshop on Industrial Applications of Document Analysis and Recognition 2019

  24. arXiv:1906.03118  [pdf, other]

    cs.LG stat.ML

    Reliable Estimation of Individual Treatment Effect with Causal Information Bottleneck

    Authors: Sungyub Kim, Yongsu Baek, Sung Ju Hwang, Eunho Yang

    Abstract: Estimating individual level treatment effects (ITE) from observational data is a challenging and important area in causal machine learning and is commonly considered in diverse mission-critical applications. In this paper, we propose an information theoretic approach in order to find more reliable representations for estimating ITE. We leverage the Information Bottleneck (IB) principle, which addr…

    Submitted 7 June, 2019; originally announced June 2019.

  25. arXiv:1904.01941  [pdf, other]

    cs.CV

    Character Region Awareness for Text Detection

    Authors: Youngmin Baek, Bado Lee, Dongyoon Han, Sangdoo Yun, Hwalsuk Lee

    Abstract: Scene text detection methods based on neural networks have emerged recently and have shown promising results. Previous methods trained with rigid word-level bounding boxes exhibit limitations in representing the text region in an arbitrary shape. In this paper, we propose a new scene text detection method to effectively detect text area by exploring each character and affinity between characters.…

    Submitted 3 April, 2019; originally announced April 2019.

    Comments: 12 pages, 11 figures, Accepted by CVPR 2019

  26. Cultural Values and Cross-cultural Video Consumption on YouTube

    Authors: Minsu Park, Jaram Park, Young Min Baek, Michael Macy

    Abstract: Video-sharing social media like YouTube provide access to diverse cultural products from all over the world, making it possible to test theories that the Web facilitates global cultural convergence. Drawing on a daily listing of YouTube's most popular videos across 58 countries, we investigate the consumption of popular videos in countries that differ in cultural values, language, gross domestic p…

    Submitted 17 May, 2017; v1 submitted 8 May, 2017; originally announced May 2017.

    Journal ref: PLoS ONE 12(5): e0177865 (2017)

  27. arXiv:1207.0349  [pdf, ps, other]

    cond-mat.stat-mech cs.CR physics.soc-ph

    Fundamental Structural Constraint of Random Scale-Free Networks

    Authors: Yongjoo Baek, Daniel Kim, Meesoon Ha, Hawoong Jeong

    Abstract: We study the structural constraint of random scale-free networks that determines possible combinations of the degree exponent $γ$ and the upper cutoff $k_c$ in the thermodynamic limit. We employ the framework of graphicality transitions proposed by [Del Genio and co-workers, Phys. Rev. Lett. 107, 178701 (2011)], while making it more rigorous and applicable to general values of $k_c$. Using the…

    Submitted 11 September, 2012; v1 submitted 2 July, 2012; originally announced July 2012.

    Comments: 5 pages, 4 figures (7 eps files), 2 tables; published version

    Journal ref: Phys. Rev. Lett. 109, 118701 (2012)