Showing 1–50 of 253 results for author: Naeem, H

  1. arXiv:2506.07460  [pdf, ps, other]

    cs.CV cs.CL

    GLOS: Sign Language Generation with Temporally Aligned Gloss-Level Conditioning

    Authors: Taeryung Lee, Hyeongjin Nam, Gyeongsik Moon, Kyoung Mu Lee

    Abstract: Sign language generation (SLG), or text-to-sign generation, bridges the gap between signers and non-signers. Despite recent progress in SLG, existing methods still often suffer from incorrect lexical ordering and low semantic accuracy. This is primarily due to sentence-level condition, which encodes the entire sentence of the input text into a single feature vector as a condition for SLG. This app…

    Submitted 9 June, 2025; originally announced June 2025.

  2. arXiv:2505.13235  [pdf, ps, other]

    cs.CV cs.LG

    WriteViT: Handwritten Text Generation with Vision Transformer

    Authors: Dang Hoai Nam, Huynh Tong Dang Khoa, Vo Nguyen Le Duy

    Abstract: Humans can quickly generalize handwriting styles from a single example by intuitively separating content from style. Machines, however, struggle with this task, especially in low-data settings, often missing subtle spatial and stylistic cues. Motivated by this gap, we introduce WriteViT, a one-shot handwritten text synthesis framework that incorporates Vision Transformers (ViT), a family of models…

    Submitted 19 May, 2025; originally announced May 2025.

  3. arXiv:2505.11855  [pdf, ps, other]

    cs.CL

    When AI Co-Scientists Fail: SPOT-a Benchmark for Automated Verification of Scientific Research

    Authors: Guijin Son, Jiwoo Hong, Honglu Fan, Heejeong Nam, Hyunwoo Ko, Seungwon Lim, Jinyeop Song, Jinha Choi, Gonçalo Paulo, Youngjae Yu, Stella Biderman

    Abstract: Recent advances in large language models (LLMs) have fueled the vision of automated scientific discovery, often called AI Co-Scientists. To date, prior work casts these systems as generative co-authors responsible for crafting hypotheses, synthesizing code, or drafting manuscripts. In this work, we explore a complementary application: using LLMs as verifiers to automate the academic verifi…

    Submitted 17 May, 2025; originally announced May 2025.

    Comments: work in progress

  4. arXiv:2504.14817  [pdf]

    eess.AS

    DNN based HRIRs Identification with a Continuously Rotating Speaker Array

    Authors: Byeong-Yun Ko, Deokki Min, Hyeonuk Nam, Yong-Hwa Park

    Abstract: Conventional static measurement of head-related impulse responses (HRIRs) is time-consuming due to the need for repositioning a speaker array for each azimuth angle. Dynamic approaches using analytical models with a continuously rotating speaker array have been proposed, but their accuracy is significantly reduced at high rotational speeds. To address this limitation, we propose a DNN-based HRIRs…

    Submitted 20 April, 2025; originally announced April 2025.

  5. arXiv:2504.12670  [pdf, other]

    eess.AS cs.SD

    Temporal Attention Pooling for Frequency Dynamic Convolution in Sound Event Detection

    Authors: Hyeonuk Nam, Yong-Hwa Park

    Abstract: Recent advances in deep learning, particularly frequency dynamic convolution (FDY conv), have significantly improved sound event detection (SED) by enabling frequency-adaptive feature extraction. However, FDY conv relies on temporal average pooling, which treats all temporal frames equally, limiting its ability to capture transient sound events such as alarm bells, door knocks, and speech plosives…

    Submitted 17 April, 2025; originally announced April 2025.

  6. arXiv:2504.00600  [pdf, other]

    hep-th gr-qc hep-ph

    De Sitter vacuum on nucleated D-branes with stringy corrections

    Authors: Cao H. Nam

    Abstract: We find four-dimensional de Sitter (dS) vacuum solutions on probe D-branes which nucleate in an asymptotically $\text{AdS}_5\times T^{1,1}$ background, including stringy corrections. A sufficiently high chemical potential induced by the wrapped D3-brane charge, breaking supersymmetry in the bulk, is essential to lead to the nucleation of the probe D-brane. We show that stringy corrections can yiel…

    Submitted 1 April, 2025; originally announced April 2025.

    Comments: 20 pages, 1 figure

  7. arXiv:2503.19373  [pdf, other]

    cs.CV cs.AI

    DeClotH: Decomposable 3D Cloth and Human Body Reconstruction from a Single Image

    Authors: Hyeongjin Nam, Donghwan Kim, Jeongtaek Oh, Kyoung Mu Lee

    Abstract: Most existing methods of 3D clothed human reconstruction from a single image treat the clothed human as a single object without distinguishing between cloth and human body. In this regard, we present DeClotH, which separately reconstructs 3D cloth and human body from a single image. This task remains largely unexplored due to the extreme occlusion between cloth and the human body, making it challe…

    Submitted 25 March, 2025; originally announced March 2025.

    Comments: Published at CVPR 2025, 17 pages including the supplementary material

  8. arXiv:2503.15879  [pdf, other]

    cs.CL cs.IR

    Typed-RAG: Type-aware Multi-Aspect Decomposition for Non-Factoid Question Answering

    Authors: DongGeon Lee, Ahjeong Park, Hyeri Lee, Hyeonseo Nam, Yunho Maeng

    Abstract: Non-factoid question-answering (NFQA) poses a significant challenge due to its open-ended nature, diverse intents, and the need for multi-aspect reasoning, which renders conventional factoid QA approaches, including retrieval-augmented generation (RAG), inadequate. Unlike factoid questions, non-factoid questions (NFQs) lack definitive answers and require synthesizing information from multiple sour…

    Submitted 21 March, 2025; v1 submitted 20 March, 2025; originally announced March 2025.

    Comments: Accepted to NAACL 2025 SRW

  9. arXiv:2503.15855  [pdf, other]

    cs.CV cs.AI

    VideoRFSplat: Direct Scene-Level Text-to-3D Gaussian Splatting Generation with Flexible Pose and Multi-View Joint Modeling

    Authors: Hyojun Go, Byeongjun Park, Hyelin Nam, Byung-Hoon Kim, Hyungjin Chung, Changick Kim

    Abstract: We propose VideoRFSplat, a direct text-to-3D model leveraging a video generation model to generate realistic 3D Gaussian Splatting (3DGS) for unbounded real-world scenes. To generate diverse camera poses and unbounded spatial extent of real-world scenes, while ensuring generalization to arbitrary text prompts, previous methods fine-tune 2D generative models to jointly model camera poses and multi-…

    Submitted 20 March, 2025; originally announced March 2025.

    Comments: Project page: https://gohyojun15.github.io/VideoRFSplat/

  10. arXiv:2503.12024  [pdf, other]

    cs.CV

    SteerX: Creating Any Camera-Free 3D and 4D Scenes with Geometric Steering

    Authors: Byeongjun Park, Hyojun Go, Hyelin Nam, Byung-Hoon Kim, Hyungjin Chung, Changick Kim

    Abstract: Recent progress in 3D/4D scene generation emphasizes the importance of physical alignment throughout video generation and scene reconstruction. However, existing methods improve the alignment separately at each stage, making it difficult to manage subtle misalignments arising from another stage. Here, we present SteerX, a zero-shot inference-time steering method that unifies scene reconstruction i…

    Submitted 15 March, 2025; originally announced March 2025.

    Comments: Project page: https://byeongjun-park.github.io/SteerX/

  11. arXiv:2503.11020  [pdf, other]

    cs.RO cs.CV

    Fast and Robust Localization for Humanoid Soccer Robot via Iterative Landmark Matching

    Authors: Ruochen Hou, Mingzhang Zhu, Hyunwoo Nam, Gabriel I. Fernandez, Dennis W. Hong

    Abstract: Accurate robot localization is essential for effective operation. Monte Carlo Localization (MCL) is commonly used with known maps but is computationally expensive due to landmark matching for each particle. Humanoid robots face additional challenges, including sensor noise from locomotion vibrations and a limited field of view (FOV) due to camera placement. This paper proposes a fast and robust lo…

    Submitted 16 May, 2025; v1 submitted 13 March, 2025; originally announced March 2025.

  12. arXiv:2503.02528  [pdf, ps, other]

    hep-ex nucl-ex

    Prospects for Pentaquark Baryon Search with the Upgraded LEPS2 Facility

    Authors: T. Nakano, S. Ajimura, Y. Asano, S. Daté, T. Hashimoto, A. Higashi, T. Hotta, T. Ishikawa, H. Katsuragawa, R. Kobayakawa, H. Kohri, K. Mizutani, Y. Ohashi, H. Ohkuma, S. Y. Ryu, S. Suzuki, S. Tanaka, K. Watanabe, B. Yan, T. Yorita, M. Yosoi, G. Kojima, M. Miyabe, N. Muramatsu, H. Ohnishi , et al. (11 additional authors not shown)

    Abstract: We present prospects for the $Θ^+$ pentaquark baryon search using the newly constructed LEPS2 facility at SPring-8. The LEPS2 detector system features significant improvements in acceptance for multi-particle final states compared to previous experiments. Our search employs two complementary strategies: direct production in the $γn \to K^-Θ^+$ reaction using a liquid deuterium target with a photon…

    Submitted 4 March, 2025; originally announced March 2025.

    Comments: to be published on Acta Physica Polonica B 56 (2025)

    Report number: RCNP-Ex25001

  13. arXiv:2502.20857  [pdf, other]

    eess.AS cs.SD

    JiTTER: Jigsaw Temporal Transformer for Event Reconstruction for Self-Supervised Sound Event Detection

    Authors: Hyeonuk Nam, Yong-Hwa Park

    Abstract: Sound event detection (SED) has significantly benefited from self-supervised learning (SSL) approaches, particularly masked audio transformer for SED (MAT-SED), which leverages masked block prediction to reconstruct missing audio segments. However, while effective in capturing global dependencies, masked block prediction disrupts transient sound events and lacks explicit enforcement of temporal or…

    Submitted 28 February, 2025; originally announced February 2025.

  14. arXiv:2502.07208  [pdf]

    eess.AS cs.SD

    Towards Understanding of Frequency Dependence on Sound Event Detection

    Authors: Hyeonuk Nam, Seong-Hu Kim, Deokki Min, Byeong-Yun Ko, Yong-Hwa Park

    Abstract: In this work, various analysis methods are conducted on frequency-dependent methods on SED to further delve into their detailed characteristics and behaviors on SED. While SED has been rapidly advancing through the adoption of various deep learning techniques from other pattern recognition fields, these techniques are often not suitable for SED. To address this issue, two frequency-dependent SED m…

    Submitted 10 February, 2025; originally announced February 2025.

  15. arXiv:2412.20638  [pdf, other]

    cs.AI cs.LG

    Predicting Long Term Sequential Policy Value Using Softer Surrogates

    Authors: Hyunji Nam, Allen Nie, Ge Gao, Vasilis Syrgkanis, Emma Brunskill

    Abstract: Off-policy policy evaluation (OPE) estimates the outcome of a new policy using historical data collected from a different policy. However, existing OPE methods cannot handle cases when the new policy introduces novel actions. This issue commonly occurs in real-world domains, like healthcare, as new drugs and treatments are continuously developed. Novel actions necessitate on-policy data collection…

    Submitted 2 February, 2025; v1 submitted 29 December, 2024; originally announced December 2024.

    Comments: 24 pages, 1 figure

  16. arXiv:2412.13648  [pdf, ps, other]

    nucl-ex

    Model-independent measurement of isospin diffusion in Ni-Ni systems at intermediate energy

    Authors: C. Ciampi, J. D. Frankland, D. Gruyer, N. Le Neindre, S. Mallik, R. Bougault, A. Chbihi, L. Baldesi, S. Barlini, E. Bonnet, B. Borderie, A. Camaiani, G. Casini, I. Dekhissi, D. Dell'Aquila, J. A. Dueñas, Q. Fable, F. Gramegna, C. Gouyet, M. Henri, B. Hong, S. Kim, A. Kordyasz, T. Kozik, M. J. Kweon , et al. (16 additional authors not shown)

    Abstract: In this work we provide a model-independent experimental evaluation of the degree of isospin equilibration taking place in $^{58,64}$Ni+$^{58,64}$Ni collisions at 32 MeV/nucleon across varying reaction centralities. This result has been obtained by combining the complementary information provided by two different datasets, sharing common characteristics. The first dataset has been acquired with th…

    Submitted 18 December, 2024; originally announced December 2024.

  17. arXiv:2412.12306  [pdf, ps, other]

    eess.SY

    Ultra-wideband Double-Directionally Resolved Channel Measurements of Line-of-Sight Microcellular Scenarios in the Upper Mid-band

    Authors: Naveed A. Abbasi, Kelvin Arana, Jorge Gomez-Ponce, Tathagat Pal, Vikram Vasudevan, Atulya Bist, Omer Gokalp Serbetci, Young Han Nam, Charlie Zhang, Andreas F. Molisch

    Abstract: The growing demand for higher data rates and expanded bandwidth is driving the exploration of new frequency ranges, including the upper mid-band spectrum (6-24 GHz), which is a promising candidate for future Frequency Range 3 (FR3) applications. This paper presents ultra-wideband double-directional channel measurements in line-of-sight microcellular scenarios within the upper mid-band spectrum (6-…

    Submitted 16 December, 2024; originally announced December 2024.

  18. arXiv:2412.08662  [pdf, other]

    physics.ins-det nucl-ex physics.acc-ph

    Performance of the prototype beam drift chamber for LAMPS at RAON with proton and Carbon-12 beams

    Authors: H. Kim, Y. Bae, C. Heo, J. Seo, J. Hwang, D. H. Moon, D. S. Ahn, J. K. Ahn, J. Bae, J. Bok, Y. Cheon, S. W. Choi, S. Do, B. Hong, S. -W. Hong, J. Huh, S. Hwang, Y. Jang, B. Kang, A. Kim, B. Kim, C. Kim, E. -J. Kim, G. Kim, G. Kim , et al. (23 additional authors not shown)

    Abstract: Beam Drift Chamber (BDC) is designed to reconstruct the trajectories of incident rare isotope beams provided by RAON (Rare isotope Accelerator complex for ON-line experiments) into the experimental target of LAMPS (Large Acceptance Multi-Purpose Spectrometer). To conduct the performance test of the BDC, the prototype BDC (pBDC) is manufactured and evaluated with the high energy ion beams from HIMA…

    Submitted 6 December, 2024; originally announced December 2024.

    Comments: 13 pages, 15 figures

    Journal ref: JINST 19 (2024) P12008

  19. arXiv:2411.19341  [pdf, other]

    cs.LG cs.AI

    An Adversarial Learning Approach to Irregular Time-Series Forecasting

    Authors: Heejeong Nam, Jihyun Kim, Jimin Yeom

    Abstract: Forecasting irregular time series presents significant challenges due to two key issues: the vulnerability of models to mean regression, driven by the noisy and complex nature of the data, and the limitations of traditional error-based evaluation metrics, which fail to capture meaningful patterns and penalize unrealistic forecasts. These problems result in forecasts that often misalign with human…

    Submitted 28 November, 2024; originally announced November 2024.

    Comments: Accepted to AdvML-Frontiers Workshop @ NeurIPS 2024

  20. arXiv:2411.15540  [pdf, other]

    cs.CV cs.AI cs.LG eess.IV

    Optical-Flow Guided Prompt Optimization for Coherent Video Generation

    Authors: Hyelin Nam, Jaemin Kim, Dohun Lee, Jong Chul Ye

    Abstract: While text-to-video diffusion models have made significant strides, many still face challenges in generating videos with temporal consistency. Within diffusion frameworks, guidance techniques have proven effective in enhancing output quality during inference; however, applying these methods to video diffusion models introduces additional complexity of handling computations across entire sequences.…

    Submitted 23 March, 2025; v1 submitted 23 November, 2024; originally announced November 2024.

    Comments: CVPR 2025 (poster); project page: https://motionprompt.github.io/

  21. arXiv:2411.14137  [pdf, other]

    cs.CV cs.CL

    VAGUE: Visual Contexts Clarify Ambiguous Expressions

    Authors: Heejeong Nam, Jinwoo Ahn, Keummin Ka, Jiwan Chung, Youngjae Yu

    Abstract: Human communication often relies on visual cues to resolve ambiguity. While humans can intuitively integrate these cues, AI systems often find it challenging to engage in sophisticated multimodal reasoning. We introduce VAGUE, a benchmark evaluating multimodal AI systems' ability to integrate visual context for intent disambiguation. VAGUE consists of 1.6K ambiguous textual expressions, each paire…

    Submitted 11 March, 2025; v1 submitted 21 November, 2024; originally announced November 2024.

    Comments: 31 pages

  22. arXiv:2410.14902  [pdf, other]

    cs.IT

    Modeling and Analysis of Hybrid GEO-LEO Satellite Networks

    Authors: Dong-Hyun Jung, Hongjae Nam, Junil Choi, David J. Love

    Abstract: As the number of low Earth orbit (LEO) satellites rapidly increases, the consideration of frequency sharing or cooperation between geosynchronous Earth orbit (GEO) and LEO satellites is gaining attention. In this paper, we consider a hybrid GEO-LEO satellite network where GEO and LEO satellites are distributed according to independent Poisson point processes (PPPs) and share the same frequency res…

    Submitted 18 October, 2024; originally announced October 2024.

    Comments: 5 pages, 4 figures, 1 table, submitted to IEEE Transactions on Vehicular Technology

  23. Scoto-seesaw model implied by flavor-dependent Abelian gauge charge

    Authors: Duong Van Loi, N. T. Duy, Cao H. Nam, Phung Van Dong

    Abstract: Assuming fundamental fermions possess a new Abelian gauge charge that depends on flavors of both quark and lepton, we obtain a simple extension of the Standard Model, which reveals some new physics insights. The new gauge charge anomaly cancellation not only explains the existence of just three fermion generations as observed but also requires the presence of a unique right-handed neutrino $ν_R$ w…

    Submitted 2 February, 2025; v1 submitted 10 September, 2024; originally announced September 2024.

    Comments: 41 pages, 10 figures, 5 tables; revised version, published in EPJC

    Journal ref: Eur. Phys. J. C 85 (2025) 109

  24. arXiv:2408.01040  [pdf, other]

    cs.DC cs.CR cs.CV cs.LG

    Privacy-Preserving Split Learning with Vision Transformers using Patch-Wise Random and Noisy CutMix

    Authors: Seungeun Oh, Sihun Baek, Jihong Park, Hyelin Nam, Praneeth Vepakomma, Ramesh Raskar, Mehdi Bennis, Seong-Lyun Kim

    Abstract: In computer vision, the vision transformer (ViT) has increasingly superseded the convolutional neural network (CNN) for improved accuracy and robustness. However, ViT's large model sizes and high sample complexity make it difficult to train on resource-constrained edge devices. Split learning (SL) emerges as a viable solution, leveraging server-side resources to train ViTs while utilizing private…

    Submitted 2 August, 2024; originally announced August 2024.

    Comments: 23 pages, 11 figures, 8 tables, to be published in Transactions on Machine Learning Research (TMLR)

  25. arXiv:2407.21410  [pdf, other]

    hep-ph gr-qc hep-th

    Brane-vector dark matter and its connection to inflation and primordial gravitational waves

    Authors: Cao H. Nam, Tran N. Hung

    Abstract: The scalar mode describing the fluctuation of the 3-brane (the observable universe) in a five-dimensional bulk spacetime compactified on a circle is absorbed by the Kaluza-Klein U(1) gauge field, leading to a massive brane-vector living on the 3-brane. The brane-vector can be responsible for dark matter because it is odd under a $\mathrm{Z}_2$ symmetry, neutral under the Standard Model (SM) symmet…

    Submitted 31 July, 2024; originally announced July 2024.

    Comments: 20 pages, 6 figures

  26. arXiv:2407.09122  [pdf, other]

    hep-th gr-qc

    Topological equivalence and phase transition rate in holographic thermodynamics of regularized Maxwell theory

    Authors: Tran N. Hung, Cao H. Nam

    Abstract: Utilizing the holographic dictionary from the proposal that treats Newton's constant as a thermodynamic variable, we establish a thermodynamic topological equivalence between the AdS black holes in the bulk and the thermal states in the dual CFT. The findings further reveal that the thermodynamic topological characteristics of the RegMax AdS black holes are strongly influenced by the characteristi…

    Submitted 20 August, 2024; v1 submitted 12 July, 2024; originally announced July 2024.

    Comments: 18 pages, 7 figures

  27. arXiv:2407.08073  [pdf, other]

    cs.RO cs.AI cs.LG

    NDST: Neural Driving Style Transfer for Human-Like Vision-Based Autonomous Driving

    Authors: Donghyun Kim, Aws Khalil, Haewoon Nam, Jaerock Kwon

    Abstract: Autonomous Vehicles (AV) and Advanced Driver Assistant Systems (ADAS) prioritize safety over comfort. The intertwining factors of safety and comfort emerge as pivotal elements in ensuring the effectiveness of Autonomous Driving (AD). Users often experience discomfort when AV or ADAS drive the vehicle on their behalf. Providing a personalized human-like AD experience, tailored to match users' uniqu…

    Submitted 10 July, 2024; originally announced July 2024.

    Comments: 9 pages, 11 figures

  28. arXiv:2407.03674  [pdf, other]

    cs.LG

    Short-Long Policy Evaluation with Novel Actions

    Authors: Hyunji Alex Nam, Yash Chandak, Emma Brunskill

    Abstract: From incorporating LLMs in education, to identifying new drugs and improving ways to charge batteries, innovators constantly try new strategies in search of better long-term outcomes for students, patients and consumers. One major bottleneck in this innovation cycle is the amount of time it takes to observe the downstream effects of a decision policy that incorporates new interventions. The key qu…

    Submitted 9 July, 2024; v1 submitted 4 July, 2024; originally announced July 2024.

    Comments: Added references for related work

  29. arXiv:2406.15725  [pdf, other]

    eess.AS cs.SD

    Self Training and Ensembling Frequency Dependent Networks with Coarse Prediction Pooling and Sound Event Bounding Boxes

    Authors: Hyeonuk Nam, Deokki Min, Seungdeok Choi, Inhan Choi, Yong-Hwa Park

    Abstract: To tackle sound event detection (SED), we propose frequency dependent networks (FreDNets), which heavily leverage frequency-dependent methods. We apply frequency warping and FilterAugment, which are frequency-dependent data augmentation methods. The model architecture consists of 3 branches: audio teacher-student transformer (ATST) branch, BEATs branch and CNN branch including either partial dilat…

    Submitted 19 September, 2024; v1 submitted 22 June, 2024; originally announced June 2024.

    Comments: DCASE 2024 Challenge Task 4 technical report, DCASE 2024 Workshop accepted

  30. arXiv:2406.13312  [pdf, other]

    eess.AS cs.SD

    Pushing the Limit of Sound Event Detection with Multi-Dilated Frequency Dynamic Convolution

    Authors: Hyeonuk Nam, Yong-Hwa Park

    Abstract: Frequency dynamic convolution (FDY conv) has been a milestone in the sound event detection (SED) field, but it involves a substantial increase in model size due to multiple basis kernels. In this work, we propose partial frequency dynamic convolution (PFD conv), which concatenates outputs by conventional 2D convolution and FDY conv as static and dynamic branches respectively. PFD-CRNN with proport…

    Submitted 19 September, 2024; v1 submitted 19 June, 2024; originally announced June 2024.

    Comments: Submitted to ICASSP 2025

  31. arXiv:2406.08070  [pdf, other]

    cs.CV cs.AI cs.LG

    CFG++: Manifold-constrained Classifier Free Guidance for Diffusion Models

    Authors: Hyungjin Chung, Jeongsol Kim, Geon Yeong Park, Hyelin Nam, Jong Chul Ye

    Abstract: Classifier-free guidance (CFG) is a fundamental tool in modern diffusion models for text-guided generation. Although effective, CFG has notable drawbacks. For instance, DDIM with CFG lacks invertibility, complicating image editing; furthermore, high guidance scales, essential for high-quality outputs, frequently result in issues like mode collapse. Contrary to the widespread belief that these are…

    Submitted 12 September, 2024; v1 submitted 12 June, 2024; originally announced June 2024.

    Comments: 25 pages, 21 figures. Project Page: https://cfgpp-diffusion.github.io/

  32. arXiv:2406.05341  [pdf, other]

    eess.AS cs.SD

    Diversifying and Expanding Frequency-Adaptive Convolution Kernels for Sound Event Detection

    Authors: Hyeonuk Nam, Seong-Hu Kim, Deokki Min, Junhyeok Lee, Yong-Hwa Park

    Abstract: Frequency dynamic convolution (FDY conv) has shown the state-of-the-art performance in sound event detection (SED) using frequency-adaptive kernels obtained by frequency-varying combination of basis kernels. However, FDY conv lacks an explicit mean to diversify frequency-adaptive kernels, potentially limiting the performance. In addition, size of basis kernels is limited while time-frequency patte…

    Submitted 7 June, 2024; originally announced June 2024.

    Comments: Accepted to INTERSPEECH 2024

  33. arXiv:2406.03494  [pdf, other]

    cs.LG math.NA stat.ML

    Solving Poisson Equations using Neural Walk-on-Spheres

    Authors: Hong Chul Nam, Julius Berner, Anima Anandkumar

    Abstract: We propose Neural Walk-on-Spheres (NWoS), a novel neural PDE solver for the efficient solution of high-dimensional Poisson equations. Leveraging stochastic representations and Walk-on-Spheres methods, we develop novel losses for neural networks based on the recursive solution of Poisson equations on spheres inside the domain. The resulting method is highly parallelizable and does not require spati…

    Submitted 5 June, 2024; originally announced June 2024.

    Comments: Accepted at ICML 2024

  34. arXiv:2405.11094  [pdf, other]

    cs.RO

    YORI: Autonomous Cooking System Utilizing a Modular Robotic Kitchen and a Dual-Arm Proprioceptive Manipulator

    Authors: Donghun Noh, Hyunwoo Nam, Kyle Gillespie, Yeting Liu, Dennis Hong

    Abstract: This article introduces the development and implementation of the Yummy Operations Robot Initiative (YORI), an innovative, autonomous robotic cooking system. YORI marks a major advancement in culinary automation, adept at handling a diverse range of cooking tasks, capable of preparing multiple dishes simultaneously, and offering the flexibility to adapt to an extensive array of culinary activities…

    Submitted 17 May, 2024; originally announced May 2024.

    Comments: This manuscript is 13 pages long, includes 10 figures, and cites 20 references. It is to be submitted

  35. arXiv:2405.02499  [pdf, other]

    cs.CR cs.AR

    DRAMScope: Uncovering DRAM Microarchitecture and Characteristics by Issuing Memory Commands

    Authors: Hwayong Nam, Seungmin Baek, Minbok Wi, Michael Jaemin Kim, Jaehyun Park, Chihun Song, Nam Sung Kim, Jung Ho Ahn

    Abstract: The demand for precise information on DRAM microarchitectures and error characteristics has surged, driven by the need to explore processing in memory, enhance reliability, and mitigate security vulnerability. Nonetheless, DRAM manufacturers have disclosed only a limited amount of information, making it difficult to find specific information on their DRAM microarchitectures. This paper addresses t…

    Submitted 3 May, 2024; originally announced May 2024.

    Comments: To appear at the 51st IEEE/ACM International Symposium on Computer Architecture (ISCA)

  36. arXiv:2404.04819  [pdf, other]

    cs.CV

    Joint Reconstruction of 3D Human and Object via Contact-Based Refinement Transformer

    Authors: Hyeongjin Nam, Daniel Sungho Jung, Gyeongsik Moon, Kyoung Mu Lee

    Abstract: Human-object contact serves as a strong cue to understand how humans physically interact with objects. Nevertheless, it is not widely explored to utilize human-object contact information for the joint reconstruction of 3D human and object from a single image. In this work, we present a novel joint 3D human-object reconstruction method (CONTHO) that effectively exploits contact information between…

    Submitted 7 April, 2024; originally announced April 2024.

    Comments: Published at CVPR 2024, 19 pages including the supplementary material

  37. arXiv:2403.16652  [pdf, other]

    cs.RO eess.SY

    Trajectory Planning of Robotic Manipulator in Dynamic Environment Exploiting DRL

    Authors: Osama Ahmad, Zawar Hussain, Hammad Naeem

    Abstract: This study is about the implementation of a reinforcement learning algorithm in the trajectory planning of manipulators. We have a 7-DOF robotic arm to pick and place the randomly placed block at a random target point in an unknown environment. The obstacle is randomly moving which creates a hurdle in picking the object. The objective of the robot is to avoid the obstacle and pick the block with c…

    Submitted 25 March, 2024; originally announced March 2024.

    Comments: Accepted in ICIESTR-2024

  38. arXiv:2403.08322  [pdf, other]

    gr-qc hep-th

    Generalized free energy and thermodynamic phases of black holes in the gauged Kaluza-Klein theory

    Authors: Tran N. Hung, Cao H. Nam

    Abstract: In the context of the generalized (off-shell) free energy, we explore the phase emergence and corresponding phase transitions of charged dilaton $\text{AdS}$ black holes in the gauged Kaluza-Klein (KK) theory where the KK vector field is gauged such that the fermionic fields are charged under the U(1)$_{\text{KK}}$ gauge group. The black hole solutions are asymptotic to the AdS$_D$ geometry and ca…

    Submitted 13 March, 2024; originally announced March 2024.

    Comments: 25 pages, 15 figures

  39. arXiv:2403.08187  [pdf, other]

    cs.CL cs.SD eess.AS

    Automatic Speech Recognition (ASR) for the Diagnosis of pronunciation of Speech Sound Disorders in Korean children

    Authors: Taekyung Ahn, Yeonjung Hong, Younggon Im, Do Hyung Kim, Dayoung Kang, Joo Won Jeong, Jae Won Kim, Min Jung Kim, Ah-ra Cho, Dae-Hyun Jang, Hosung Nam

    Abstract: This study presents a model of automatic speech recognition (ASR) designed to diagnose pronunciation issues in children with speech sound disorders (SSDs) to replace manual transcriptions in clinical procedures. Since ASR models trained for general purposes primarily predict input speech into real words, employing a well-known high-performance ASR model for evaluating pronunciation in children wit…

    Submitted 12 March, 2024; originally announced March 2024.

    Comments: 12 pages, 2 figures

    ACM Class: I.2.7

  40. arXiv:2402.10595  [pdf, other]

    cs.CV

    Compact and De-biased Negative Instance Embedding for Multi-Instance Learning on Whole-Slide Image Classification

    Authors: Joohyung Lee, Heejeong Nam, Kwanhyung Lee, Sangchul Hahn

    Abstract: Whole-slide image (WSI) classification is a challenging task because 1) patches from WSI lack annotation, and 2) WSI possesses unnecessary variability, e.g., stain protocol. Recently, Multiple-Instance Learning (MIL) has made significant progress, allowing for classification based on slide-level, rather than patch-level, annotations. However, existing MIL methods ignore that all patches from norma…

    Submitted 16 February, 2024; originally announced February 2024.

    Comments: Accepted to ICASSP 2024

  41. Study of quasi-projectile properties at Fermi energies in 48Ca projectile systems

    Authors: S. Upadhyaya, K. Mazurek, T. Kozik, D. Gruyer, G. Casini, S. Piantelli, L. Baldesi, S. Barlini, B. Borderie, R. Bougault, A. Camaiani, C. Ciampi, M. Cicerchia, M. Ciemala, D. Dell Aquila, J. A. Duenas, Q. Fable, J. D. Frankland, F. Gramegna, M. Henri, B. Hong, A. Kordyasz, M. J. Kweon, N. Le Neindre, I. Lombardo , et al. (10 additional authors not shown)

    Abstract: The emission of the pre-equilibrium particles during nuclear collisions at moderate beam energies is still an open question. This influences the properties of the compound nucleus but also changes the interpretation of the quasi-fission process. A systematic analysis of the data obtained by the FAZIA collaboration during a recent experiment with a neutron rich projectile is presented. The full ran…

    Submitted 14 February, 2024; originally announced February 2024.

    Comments: 10 pages, 10 figures

  42. arXiv:2401.04433  [pdf, other]

    hep-th gr-qc hep-ph

    Non-singular cosmology from non-supersymmetric AdS instability conjecture

    Authors: Cao H. Nam

    Abstract: We show that the non-supersymmetric AdS instability conjecture can point to how quantum gravity removes the initial Big Bang singularity, leading to a potential resolution for the past-incomplete inflationary universe. From the constraints on the dynamics of the universe realized as the nucleation of a thin-wall bubble mediating the decay of the non-supersymmetric AdS vacuum, we find the critical…

    Submitted 12 August, 2024; v1 submitted 9 January, 2024; originally announced January 2024.

    Comments: 6 pages, 4 figures, new discussions added, Fig.2 modified, references added, version to be published in PRD

  43. arXiv:2401.04143  [pdf, other]

    cs.CV

    RHOBIN Challenge: Reconstruction of Human Object Interaction

    Authors: Xianghui Xie, Xi Wang, Nikos Athanasiou, Bharat Lal Bhatnagar, Chun-Hao P. Huang, Kaichun Mo, Hao Chen, Xia Jia, Zerui Zhang, Liangxian Cui, Xiao Lin, Bingqiao Qian, Jie Xiao, Wenfei Yang, Hyeongjin Nam, Daniel Sungho Jung, Kihoon Kim, Kyoung Mu Lee, Otmar Hilliges, Gerard Pons-Moll

    Abstract: Modeling the interaction between humans and objects has been an emerging research direction in recent years. Capturing human-object interaction is however a very challenging task due to heavy occlusion and complex dynamics, which requires understanding not only 3D human pose, and object pose but also the interaction between them. Reconstruction of 3D humans and objects has been two separate resear…

    Submitted 7 January, 2024; originally announced January 2024.

    Comments: 14 pages, 5 tables, 7 figures. Technical report of the CVPR'23 workshop: RHOBIN challenge (https://rhobin-challenge.github.io/)

  44. arXiv:2312.15924  [pdf, other]

    cs.IT eess.SP

    Modeling and Analysis of GEO Satellite Networks

    Authors: Dong-Hyun Jung, Hongjae Nam, Junil Choi, David J. Love

    Abstract: The extensive coverage offered by satellites makes them effective in enhancing service continuity for users on dynamic airborne and maritime platforms, such as airplanes and ships. In particular, geosynchronous Earth orbit (GEO) satellites ensure stable connectivity for terrestrial users due to their stationary characteristics when observed from Earth. This paper introduces a novel approach to mod…

    Submitted 26 December, 2023; originally announced December 2023.

    Comments: 12 pages, 9 figures, submitted to IEEE Transactions on Wireless Communications

  45. Isospin diffusion from $^{40,48}$Ca$+^{40,48}$Ca experimental data at Fermi energies: Direct comparisons with transport model calculations

    Authors: Q. Fable, L. Baldesi, S. Barlini, Eric Bonnet, Bernard Borderie, Remi Bougault, A. Camaiani, G. Casini, A. Chbihi, Caterina Ciampi, J. A. Dueñas, J. D. Frankland, T. Genard, Diego D. Gruyer, Maxime Henri, Byungsik Hong, S. Kim, A. J. Kordyasz, T. Kozik, Arnaud Le Fèvre, Nicolas Le Neindre, Ivano Lombardo, Olivier Lopez, T. Marchi, Paola Marini, et al. (8 additional authors not shown)

    Abstract: This article presents an investigation of isospin equilibration in cross-bombarding $^{40,48}$Ca$+^{40,48}$Ca reactions at 35 MeV/nucleon, by comparing experimental data with filtered transport model calculations. Isospin diffusion is studied using the evolution of the isospin transport ratio with centrality. The asymmetry parameter $δ=(N-Z)/A$ of the quasiprojectile (QP) residue is used as isospi…

    Submitted 6 June, 2024; v1 submitted 4 December, 2023; originally announced December 2023.

    Journal ref: Physical Review C, 109 (064605)

  46. arXiv:2311.18608  [pdf, other]

    cs.CV cs.AI cs.LG

    Contrastive Denoising Score for Text-guided Latent Diffusion Image Editing

    Authors: Hyelin Nam, Gihyun Kwon, Geon Yeong Park, Jong Chul Ye

    Abstract: With the remarkable advent of text-to-image diffusion models, image editing methods have become more diverse and continue to evolve. A promising recent approach in this realm is Delta Denoising Score (DDS) - an image editing technique based on Score Distillation Sampling (SDS) framework that leverages the rich generative prior of text-to-image diffusion models. However, relying solely on the diffe…

    Submitted 1 April, 2024; v1 submitted 30 November, 2023; originally announced November 2023.

    Comments: CVPR 2024 (poster); Project page: https://hyelinnam.github.io/CDS/

  47. arXiv:2311.13384  [pdf, other]

    cs.CV

    LucidDreamer: Domain-free Generation of 3D Gaussian Splatting Scenes

    Authors: Jaeyoung Chung, Suyoung Lee, Hyeongjin Nam, Jaerin Lee, Kyoung Mu Lee

    Abstract: With the widespread usage of VR devices and contents, demands for 3D scene generation techniques become more popular. Existing 3D scene generation models, however, limit the target scene to specific domain, primarily due to their training strategies using 3D scan dataset that is far from the real-world. To address such limitation, we propose LucidDreamer, a domain-free scene generation pipeline by…

    Submitted 23 November, 2023; v1 submitted 22 November, 2023; originally announced November 2023.

    Comments: Project page: https://luciddreamer-cvlab.github.io/

  48. arXiv:2311.06567  [pdf, other]

    cs.LG cs.AI cs.CV

    SCADI: Self-supervised Causal Disentanglement in Latent Variable Models

    Authors: Heejeong Nam

    Abstract: Causal disentanglement has great potential for capturing complex situations. However, there is a lack of practical and efficient approaches. It is already known that most unsupervised disentangling methods are unable to produce identifiable results without additional information, often leading to randomly disentangled output. Therefore, most existing models for disentangling are weakly supervised,…

    Submitted 11 November, 2023; originally announced November 2023.

    Comments: 12 pages, 12 figures

  49. arXiv:2311.02010  [pdf, other]

    cs.CY

    A cast of thousands: How the IDEAS Productivity project has advanced software productivity and sustainability

    Authors: Lois Curfman McInnes, Michael Heroux, David E. Bernholdt, Anshu Dubey, Elsa Gonsiorowski, Rinku Gupta, Osni Marques, J. David Moulton, Hai Ah Nam, Boyana Norris, Elaine M. Raybourn, Jim Willenbring, Ann Almgren, Ross Bartlett, Kita Cranfill, Stephen Fickas, Don Frederick, William Godoy, Patricia Grubel, Rebecca Hartman-Baker, Axel Huebl, Rose Lynch, Addi Malviya Thakur, Reed Milewicz, Mark C. Miller, et al. (9 additional authors not shown)

    Abstract: Computational and data-enabled science and engineering are revolutionizing advances throughout science and society, at all scales of computing. For example, teams in the U.S. DOE Exascale Computing Project have been tackling new frontiers in modeling, simulation, and analysis by exploiting unprecedented exascale computing capabilities-building an advanced software ecosystem that supports next-gene…

    Submitted 16 February, 2024; v1 submitted 3 November, 2023; originally announced November 2023.

    Comments: 12 pages, 1 figure

  50. Victima: Drastically Increasing Address Translation Reach by Leveraging Underutilized Cache Resources

    Authors: Konstantinos Kanellopoulos, Hong Chul Nam, F. Nisa Bostanci, Rahul Bera, Mohammad Sadrosadati, Rakesh Kumar, Davide-Basilio Bartolini, Onur Mutlu

    Abstract: Address translation is a performance bottleneck in data-intensive workloads due to large datasets and irregular access patterns that lead to frequent high-latency page table walks (PTWs). PTWs can be reduced by using (i) large hardware TLBs or (ii) large software-managed TLBs. Unfortunately, both solutions have significant drawbacks: increased access latency, power and area (for hardware TLBs), an…

    Submitted 5 January, 2024; v1 submitted 6 October, 2023; originally announced October 2023.

    Comments: To appear in 56th IEEE/ACM International Symposium on Microarchitecture (MICRO), 2023

    ACM Class: C.0