A multiscale cavity method for sublinear-rank symmetric matrix factorization

Authors: Jean Barbier, Justin Ko, Anas A. Rahman

Abstract: We consider a statistical model for symmetric matrix factorization with additive Gaussian noise in the high-dimensional regime where the rank $M$ of the signal matrix to infer scales with its size $N$ as $M={\rm o}(\sqrt{\ln N})$. Allowing for an $N$-dependent rank offers new challenges and requires new methods. Working in the Bayes-optimal setting, we show that whenever the signal has i.i.d. entries, the limiting mutual information between signal and data is given by a variational formula involving a rank-one replica symmetric potential. In other words, from the information-theoretic perspective, the case of a (slowly) growing rank is the same as when $M=1$ (namely, the standard spiked Wigner model). The proof is primarily based on a novel multiscale cavity method allowing for growing rank along with some information-theoretic identities on worst noise for the vector Gaussian channel. We believe that the cavity method developed here will play a role in the analysis of a broader class of inference and spin models where the degrees of freedom are large arrays instead of vectors.

Submitted 20 March, 2025; v1 submitted 11 March, 2024; originally announced March 2024.

Comments: 64 pages. Filled out proof details, with one step being more involved than initially thought and resulting in changes to the main theorem

arXiv:2403.04234

Fundamental limits of Non-Linear Low-Rank Matrix Estimation

Authors: Pierre Mergny, Justin Ko, Florent Krzakala, Lenka Zdeborová

Abstract: We consider the task of estimating a low-rank matrix from non-linear and noisy observations. We prove a strong universality result showing that Bayes-optimal performances are characterized by an equivalent Gaussian model with an effective prior, whose parameters are entirely determined by an expansion of the non-linear function. In particular, we show that to reconstruct the signal accurately, one requires a signal-to-noise ratio growing as $N^{\frac 12 (1-1/k_F)}$, where $k_F$ is the first non-zero Fisher information coefficient of the function. We provide asymptotic characterization for the minimal achievable mean squared error (MMSE) and an approximate message-passing algorithm that reaches the MMSE under conditions analogous to the linear version of the problem. We also provide asymptotic errors achieved by methods such as principal component analysis combined with Bayesian denoising, and compare them with Bayes-optimal MMSE.

Submitted 7 March, 2024; originally announced March 2024.

Comments: 42 pages, 2 figures

arXiv:2403.04134

An Adaptable, Safe, and Portable Robot-Assisted Feeding System

Authors: Ethan Kroll Gordon, Rajat Kumar Jenamani, Amal Nanavati, Ziang Liu, Haya Bolotski, Raida Karim, Daniel Stabile, Atharva Kashyap, Bernie Hao Zhu, Xilai Dai, Tyler Schrenk, Jonathan Ko, Taylor Kessler Faulkner, Tapomayukh Bhattacharjee, Siddhartha Srinivasa

Abstract: We demonstrate a robot-assisted feeding system that enables people with mobility impairments to feed themselves. Our system design embodies Safety, Portability, and User Control, with comprehensive full-stack safety checks, the ability to be mounted on and powered by any powered wheelchair, and a custom web-app allowing care-recipients to leverage their own assistive devices for robot control. For bite acquisition, we leverage multi-modal online learning to tractably adapt to unseen food types. For bite transfer, we leverage real-time mouth perception and interaction-aware control. Co-designed with community researchers, our system has been validated through multiple end-user studies.

Submitted 6 March, 2024; originally announced March 2024.

Comments: HRI 2024 Demo; Corrected inaccurate author ordering in ACM DL which occurred due to formatting issues

arXiv:2403.03695

Spectral Phase Transition and Optimal PCA in Block-Structured Spiked models

Authors: Pierre Mergny, Justin Ko, Florent Krzakala

Abstract: We discuss the inhomogeneous spiked Wigner model, a theoretical framework recently introduced to study structured noise in various learning scenarios, through the prism of random matrix theory, with a specific focus on its spectral properties. Our primary objective is to find an optimal spectral method and to extend the celebrated Baik-Ben Arous-Péché (BBP) phase transition criterion -- well-known in the homogeneous case -- to our inhomogeneous, block-structured, Wigner model. We provide a thorough rigorous analysis of a transformed matrix and show that the transition for the appearance of 1) an outlier outside the bulk of the limiting spectral distribution and 2) a positive overlap between the associated eigenvector and the signal, occurs precisely at the optimal threshold, making the proposed spectral method optimal within the class of iterative methods for the inhomogeneous Wigner problem.

Submitted 6 March, 2024; originally announced March 2024.

Comments: 26 pages, 2 figures

Journal ref: Proceedings of the 41st International Conference on Machine Learning, PMLR 235:35470-35491, 2024

arXiv:2402.18545

doi 10.1001/jamanetworkopen.2024.46615

Crowdsourcing Dermatology Images with Google Search Ads: Creating a Real-World Skin Condition Dataset

Authors: Abbi Ward, Jimmy Li, Julie Wang, Sriram Lakshminarasimhan, Ashley Carrick, Bilson Campana, Jay Hartford, Pradeep Kumar S, Tiya Tiyasirichokchai, Sunny Virmani, Renee Wong, Yossi Matias, Greg S. Corrado, Dale R. Webster, Dawn Siegel, Steven Lin, Justin Ko, Alan Karthikesalingam, Christopher Semturs, Pooja Rao

Abstract: Background: Health datasets from clinical sources do not reflect the breadth and diversity of disease in the real world, impacting research, medical education, and artificial intelligence (AI) tool development. Dermatology is a suitable area to develop and test a new and scalable method to create representative health datasets. Methods: We used Google Search advertisements to invite contributions to an open access dataset of images of dermatology conditions, demographic and symptom information. With informed contributor consent, we describe and release this dataset containing 10,408 images from 5,033 contributions from internet users in the United States over 8 months starting March 2023. The dataset includes dermatologist condition labels as well as estimated Fitzpatrick Skin Type (eFST) and Monk Skin Tone (eMST) labels for the images. Results: We received a median of 22 submissions/day (IQR 14-30). Female (66.72%) and younger (52% < age 40) contributors had a higher representation in the dataset compared to the US population, and 32.6% of contributors reported a non-White racial or ethnic identity. Over 97.5% of contributions were genuine images of skin conditions. Dermatologist confidence in assigning a differential diagnosis increased with the number of available variables, and showed a weaker correlation with image sharpness (Spearman's P values <0.001 and 0.01 respectively). Most contributions were short-duration (54% with onset < 7 days ago) and 89% were allergic, infectious, or inflammatory conditions. eFST and eMST distributions reflected the geographical origin of the dataset. The dataset is available at github.com/google-research-datasets/scin. Conclusion: Search ads are effective at crowdsourcing images of health conditions. The SCIN dataset bridges important gaps in the availability of representative images of common skin conditions.

Submitted 28 February, 2024; originally announced February 2024.

Journal ref: JAMA Network Open (2024)

arXiv:2402.18293

Continuous Memory Representation for Anomaly Detection

Authors: Joo Chan Lee, Taejune Kim, Eunbyung Park, Simon S. Woo, Jong Hwan Ko

Abstract: There have been significant advancements in anomaly detection in an unsupervised manner, where only normal images are available for training. Several recent methods aim to detect anomalies based on a memory, comparing or reconstructing the input with directly stored normal features (or trained features with normal images). However, such memory-based approaches operate on a discrete feature space implemented by the nearest neighbor or attention mechanism, suffering from poor generalization or an identity shortcut issue outputting the same as input, respectively. Furthermore, the majority of existing methods are designed to detect single-class anomalies, resulting in unsatisfactory performance when presented with multiple classes of objects. To tackle all of the above challenges, we propose CRAD, a novel anomaly detection method for representing normal features within a "continuous" memory, enabled by transforming spatial features into coordinates and mapping them to continuous grids. Furthermore, we carefully design the grids tailored for anomaly detection, representing both local and global normal features and fusing them effectively. Our extensive experiments demonstrate that CRAD successfully generalizes the normal features and mitigates the identity shortcut. Furthermore, CRAD effectively handles diverse classes in a single model thanks to the high-granularity continuous representation. In an evaluation using the MVTec AD dataset, CRAD significantly outperforms the previous state-of-the-art method by reducing 65.0% of the error for multi-class unified anomaly detection. The project page is available at https://tae-mo.github.io/crad/.

Submitted 24 July, 2024; v1 submitted 28 February, 2024; originally announced February 2024.

Comments: Project page: https://tae-mo.github.io/crad/

arXiv:2402.17958

Spatial Distribution of Intracluster Light versus Dark Matter in Horizon Run 5

Authors: Jaewon Yoo, Changbom Park, Cristiano G. Sabiu, Ankit Singh, Jongwan Ko, Jaehyun Lee, Christophe Pichon, M. James Jee, Brad K. Gibson, Owain Snaith, Juhan Kim, Jihye Shin, Yonghwi Kim, Hyowon Kim

Abstract: One intriguing approach for studying the dynamical evolution of galaxy clusters is to compare the spatial distributions among various components, such as dark matter, member galaxies, gas, and intracluster light (ICL). Utilizing the recently introduced Weighted Overlap Coefficient (WOC) \citep{2022ApJS..261...28Y}, we analyze the spatial distributions of components within 174 galaxy clusters ($M_{\rm tot}> 5 \times 10^{13} M_{\odot}$, $z=0.625$) at varying dynamical states in the cosmological hydrodynamical simulation Horizon Run 5. We observe that the distributions of gas and the combination of ICL with the brightest cluster galaxy (BCG) closely resemble the dark matter distribution, particularly in more relaxed clusters, characterized by the half-mass epoch. The similarity in spatial distribution between dark matter and BCG+ICL mimics the changes in the dynamical state of clusters during a major merger. Notably, at redshifts $>$ 1, BCG+ICL traced dark matter more accurately than the gas. Additionally, we examined the one-dimensional radial profiles of each component, which show that the BCG+ICL is a sensitive component revealing the dynamical state of clusters. We propose a new method that can approximately recover the dark matter profile by scaling the BCG+ICL radial profile. Furthermore, we find a recipe for tracing dark matter in unrelaxed clusters by including the most massive satellite galaxies together with BCG+ICL distribution. Combining the BCG+ICL and the gas distribution enhances the dark matter tracing ability. Our results imply that the BCG+ICL distribution is an effective tracer for the dark matter distribution, and the similarity of spatial distribution may be a useful probe of the dynamical state of a cluster.

Submitted 27 February, 2024; originally announced February 2024.

Comments: 23 pages, 12 figures, accepted for publication in ApJ

arXiv:2402.17125

doi 10.1016/j.nima.2024.169489

Waveform Simulation for Scintillation Characteristics of NaI(Tl) Crystal

Authors: J. J. Choi, C. Ha, E. J. Jeon, K. W. Kim, S. K. Kim, Y. D. Kim, Y. J. Ko, B. C. Koh, H. S. Lee, S. H. Lee, S. M. Lee, B. J. Park, G. H. Yu

Abstract: The lowering of the energy threshold in the NaI detector is crucial not only for comprehensive validation of DAMA/LIBRA but also for exploring new possibilities in the search for low-mass dark matter and observing coherent elastic scattering between neutrino and nucleus. Alongside hardware enhancements, extensive efforts have focused on refining event selection to discern noise, achieved through parameter development and the application of machine learning. Acquiring pure, unbiased datasets is crucial in this endeavor, for which a waveform simulation was developed. The simulation data were compared with the experimental data using several pulse shape discrimination parameters to test its performance in describing the experimental data. Additionally, we present the outcomes of multi-variable machine learning trained with simulation data as a scintillation signal sample. The distributions of outcomes for experimental and simulation data show good agreement. As an application of the waveform simulation, we validate the trigger efficiency alongside estimations derived from the minimally biased measurement data.

Submitted 17 June, 2024; v1 submitted 26 February, 2024; originally announced February 2024.

Journal ref: NIM A 1065, 169489 (2024)

arXiv:2402.16506

Stochastic Conditional Diffusion Models for Robust Semantic Image Synthesis

Authors: Juyeon Ko, Inho Kong, Dogyun Park, Hyunwoo J. Kim

Abstract: Semantic image synthesis (SIS) is a task to generate realistic images corresponding to semantic maps (labels). However, in real-world applications, SIS often encounters noisy user inputs. To address this, we propose Stochastic Conditional Diffusion Model (SCDM), which is a robust conditional diffusion model that features novel forward and generation processes tailored for SIS with noisy labels. It enhances robustness by stochastically perturbing the semantic label maps through Label Diffusion, which diffuses the labels with discrete diffusion. Through the diffusion of labels, the noisy and clean semantic maps become similar as the timestep increases, eventually becoming identical at $t=T$. This facilitates the generation of an image close to a clean image, enabling robust generation. Furthermore, we propose a class-wise noise schedule to differentially diffuse the labels depending on the class. We demonstrate that the proposed method generates high-quality samples through extensive experiments and analyses on benchmark datasets, including a novel experimental setup simulating human errors during real-world applications. Code is available at https://github.com/mlvlab/SCDM.

Submitted 3 June, 2024; v1 submitted 26 February, 2024; originally announced February 2024.

Comments: ICML 2024

arXiv:2402.15566

Closing the AI generalization gap by adjusting for dermatology condition distribution differences across clinical settings

Authors: Rajeev V. Rikhye, Aaron Loh, Grace Eunhae Hong, Preeti Singh, Margaret Ann Smith, Vijaytha Muralidharan, Doris Wong, Rory Sayres, Michelle Phung, Nicolas Betancourt, Bradley Fong, Rachna Sahasrabudhe, Khoban Nasim, Alec Eschholz, Basil Mustafa, Jan Freyberg, Terry Spitz, Yossi Matias, Greg S. Corrado, Katherine Chou, Dale R. Webster, Peggy Bui, Yuan Liu, Yun Liu, Justin Ko, et al. (1 additional author not shown)

Abstract: Recently, there has been great progress in the ability of artificial intelligence (AI) algorithms to classify dermatological conditions from clinical photographs. However, little is known about the robustness of these algorithms in real-world settings where several factors can lead to a loss of generalizability. Understanding and overcoming these limitations will permit the development of generalizable AI that can aid in the diagnosis of skin conditions across a variety of clinical settings. In this retrospective study, we demonstrate that differences in skin condition distribution, rather than in demographics or image capture mode, are the main source of errors when an AI algorithm is evaluated on data from a previously unseen source. We demonstrate a series of steps to close this generalization gap, requiring progressively more information about the new source, ranging from the condition distribution to training data enriched for data less frequently seen during training. Our results also suggest comparable performance from end-to-end fine tuning versus fine tuning solely the classification layer on top of a frozen embedding model. Our approach can inform the adaptation of AI algorithms to new settings, based on the information and resources available.

Submitted 23 February, 2024; originally announced February 2024.

arXiv:2402.15122

Measurements of low-energy nuclear recoil quenching factors for Na and I recoils in the NaI(Tl) scintillator

Authors: S. H. Lee, H. W. Joo, H. J. Kim, K. W. Kim, S. K. Kim, Y. D. Kim, Y. J. Ko, H. S. Lee, J. Y. Lee, H. S. Park, Y. S. Yoon

Abstract: Elastic scattering off nuclei in target detectors, involving interactions with dark matter and coherent elastic neutrino nuclear recoil (CE$ν$NS), results in the deposition of low energy within the nuclei, dissipating rapidly through a combination of heat and ionization. The primary energy loss mechanism for nuclear recoil is heat, leading to consistently smaller measurable scintillation signals compared to electron recoils of the same energy. The nuclear recoil quenching factor (QF), representing the ratio of scintillation light yield produced by nuclear recoil to that of electron recoil at the same energy, is a critical parameter for understanding dark matter and neutrino interactions with nuclei. The low energy QF of NaI(Tl) crystals, commonly employed in dark matter searches and CE$ν$NS measurements, is of substantial importance. Previous low energy QF measurements were constrained by contamination from photomultiplier tube (PMT)-induced noise, resulting in an observed light yield of approximately 15 photoelectrons per keVee (kilo-electron-volt electron-equivalent energy) and nuclear recoil energy above 5 keVnr (kilo-electron-volt nuclear recoil energy). Through enhanced crystal encapsulation, an increased light yield of around 26 photoelectrons per keVee is achieved. This improvement enables the measurement of the nuclear recoil QF for sodium nuclei at an energy of 3.8 $\pm$ 0.6 keVnr with a QF of 11.2 $\pm$ 1.7%. Furthermore, a reevaluation of previously reported QF results is conducted, incorporating enhancements in low energy events based on waveform simulation. The outcomes are generally consistent with various recent QF measurements for sodium and iodine.

Submitted 8 July, 2024; v1 submitted 23 February, 2024; originally announced February 2024.

arXiv:2402.14196

Mip-Grid: Anti-aliased Grid Representations for Neural Radiance Fields

Authors: Seungtae Nam, Daniel Rho, Jong Hwan Ko, Eunbyung Park

Abstract: Despite the remarkable achievements of neural radiance fields (NeRF) in representing 3D scenes and generating novel view images, the aliasing issue, rendering "jaggies" or "blurry" images at varying camera distances, remains unresolved in most existing approaches. The recently proposed mip-NeRF has addressed this challenge by rendering conical frustums instead of rays. However, it relies on an MLP architecture to represent the radiance fields, missing out on the fast training speed offered by the latest grid-based methods. In this work, we present mip-Grid, a novel approach that integrates anti-aliasing techniques into grid-based representations for radiance fields, mitigating the aliasing artifacts while enjoying fast training time. The proposed method generates multi-scale grids by applying simple convolution operations over a shared grid representation and uses the scale-aware coordinate to retrieve features at different scales from the generated multi-scale grids. To test the effectiveness, we integrated the proposed method into the two recent representative grid-based methods, TensoRF and K-Planes. Experimental results demonstrate that mip-Grid greatly improves the rendering performance of both methods and even outperforms mip-NeRF on multi-scale datasets while achieving significantly faster training time. For code and demo videos, please see https://stnamjef.github.io/mipgrid.github.io/.

Submitted 21 February, 2024; originally announced February 2024.

Comments: Accepted to NeurIPS 2023

arXiv:2402.03898

DistiLLM: Towards Streamlined Distillation for Large Language Models

Authors: Jongwoo Ko, Sungnyun Kim, Tianyi Chen, Se-Young Yun

Abstract: Knowledge distillation (KD) is widely used for compressing a teacher model to a smaller student model, reducing its inference cost and memory footprint while preserving model capabilities. However, current KD methods for auto-regressive sequence models (e.g., large language models) suffer from missing a standardized objective function. Moreover, the recent use of student-generated outputs to address training-inference mismatches has significantly escalated computational costs. To tackle these issues, we introduce DistiLLM, a more effective and efficient KD framework for auto-regressive language models. DistiLLM comprises two components: (1) a novel skew Kullback-Leibler divergence loss, where we unveil and leverage its theoretical properties, and (2) an adaptive off-policy approach designed to enhance the efficiency in utilizing student-generated outputs. Extensive experiments, including instruction-following tasks, demonstrate the effectiveness of DistiLLM in building high-performing student models while achieving up to 4.3$\times$ speedup compared to recent KD methods.

Submitted 3 July, 2024; v1 submitted 6 February, 2024; originally announced February 2024.

Comments: ICML 2024; Code is available at https://github.com/jongwooko/distillm
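
Note: The abstract above names a skew Kullback-Leibler divergence loss without giving its form. The sketch below assumes the common $\alpha$-skew definition, $\mathrm{KL}(p \,\|\, \alpha p + (1-\alpha) q)$, and is not taken from the DistiLLM code; the function names and the value `alpha=0.1` are illustrative choices only.

```python
import math


def kl_divergence(p, q):
    """KL(p || q) for discrete distributions given as lists of probabilities."""
    # Terms with p_i == 0 contribute nothing; q_i == 0 with p_i > 0 would fail.
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def skew_kl(p, q, alpha=0.1):
    """Assumed alpha-skew KL: KL(p || alpha * p + (1 - alpha) * q)."""
    # Mixing a little of p into q keeps the second argument positive wherever p is.
    mix = [alpha * pi + (1 - alpha) * qi for pi, qi in zip(p, q)]
    return kl_divergence(p, mix)


# Hypothetical teacher and student next-token distributions.
teacher = [0.7, 0.2, 0.1]
student = [0.5, 0.5, 0.0]  # the student assigns zero mass to the third token

# Plain KL(teacher || student) is undefined here; the skewed version stays finite.
print(skew_kl(teacher, student, alpha=0.1))
```

Under this assumed definition, the mixture in the second argument is what keeps the loss finite when the student assigns zero probability to a token the teacher supports.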

arXiv:2401.15894

Enhancing Topological Dependencies in Spatio-Temporal Graphs with Cycle Message Passing Blocks

Authors: Minho Lee, Yun Young Choi, Sun Woo Park, Seunghwan Lee, Joohwan Ko, Jaeyoung Hong

Abstract: Graph Neural Networks (GNNs) and Transformer-based models have been increasingly adopted to learn the complex vector representations of spatio-temporal graphs, capturing intricate spatio-temporal dependencies crucial for applications such as traffic datasets. Although many existing methods utilize multi-head attention mechanisms and message-passing neural networks (MPNNs) to capture both spatial and temporal relations, these approaches encode temporal and spatial relations independently, and reflect the graph's topological characteristics in a limited manner. In this work, we introduce the Cycle to Mixer (Cy2Mixer), a novel spatio-temporal GNN based on topologically non-trivial invariants of spatio-temporal graphs with gated multi-layer perceptrons (gMLP). The Cy2Mixer is composed of three blocks based on MLPs: a temporal block for capturing temporal properties, a message-passing block for encapsulating spatial information, and a cycle message-passing block for enriching topological information through cyclic subgraphs. We bolster the effectiveness of Cy2Mixer with mathematical evidence emphasizing that our cycle message-passing block is capable of offering differentiated information to the deep learning model compared to the message-passing block. Furthermore, empirical evaluations substantiate the efficacy of the Cy2Mixer, demonstrating state-of-the-art performances across various spatio-temporal benchmark datasets. The source code is available at https://github.com/leemingo/cy2mixer.

Submitted 5 December, 2024; v1 submitted 29 January, 2024; originally announced January 2024.

Comments: Proceedings of the Third Learning on Graphs Conference (LoG 2024)

arXiv:2401.13215

doi 10.1007/s40042-024-01120-9

νOscillation: a software package for computation and simulation of neutrino propagation and interaction

Authors: Seonghyeok Jang, Eunju Jeon, Eunil Won, Young Ju Ko, Kyungmin Lee

Abstract: The behavior of neutrinos is the only phenomenon that cannot be explained by the standard model of particle physics. Because of these mysterious neutrino interactions observed in nature, at present, there is growing interest in this field and ongoing or planned neutrino experiments are seeking solutions to this mystery very actively. The design of neutrino experiments and the analysis of neutrino data rely on precise computations of neutrino oscillations and scattering processes in general. Motivated by this, we developed a software package that calculates neutrino production and oscillation in nuclear reactors, neutrino-electron scattering of solar neutrinos, and the oscillation of neutrinos from radioactive isotopes for the search of sterile neutrinos. This software package is validated by reproducing the result of calculations and observations in other publications. We also demonstrate the feasibility of this package by calculating the sensitivity of a liquid scintillator detector, currently in planning, to the sterile neutrinos. This work is expected to be used in designs of future neutrino experiments.

Submitted 4 October, 2024; v1 submitted 23 January, 2024; originally announced January 2024.

Comments: 15 pages, 8 figures

Journal ref: J. Korean Phys. Soc. 85 (2024) 381-388

arXiv:2401.10989

Provably Scalable Black-Box Variational Inference with Structured Variational Families

Authors: Joohwan Ko, Kyurae Kim, Woo Chang Kim, Jacob R. Gardner

Abstract: Variational families with full-rank covariance approximations are known not to work well in black-box variational inference (BBVI), both empirically and theoretically. In fact, recent computational complexity results for BBVI have established that full-rank variational families scale poorly with the dimensionality of the problem compared to e.g. mean-field families. This is particularly critical to hierarchical Bayesian models with local variables; their dimensionality increases with the size of the datasets. Consequently, one gets an iteration complexity with an explicit $\mathcal{O}(N^2)$ dependence on the dataset size $N$. In this paper, we explore a theoretical middle ground between mean-field variational families and full-rank families: structured variational families. We rigorously prove that certain scale matrix structures can achieve a better iteration complexity of $\mathcal{O}\left(N\right)$, implying better scaling with respect to $N$. We empirically verify our theoretical results on large-scale hierarchical models.

Submitted 30 November, 2024; v1 submitted 19 January, 2024; originally announced January 2024.

Comments: Accepted to ICML'24; v3: fixed typos

arXiv:2401.09986 [pdf, other]

Improving Local Training in Federated Learning via Temperature Scaling

Authors: Kichang Lee, Songkuk Kim, JeongGil Ko

Abstract: Federated learning is inherently hampered by data heterogeneity: non-i.i.d. training data over local clients. We propose a novel model training approach for federated learning, FLex&Chill, which exploits the Logit Chilling method. Through extensive evaluations, we demonstrate that, in the presence of non-i.i.d. data characteristics inherent in federated learning systems, this approach can expedite model convergence and improve inference accuracy. Quantitatively, from our experiments, we observe up to 6X improvement in the global federated learning model convergence time, and up to 3.37% improvement in inference accuracy.

Submitted 26 June, 2024; v1 submitted 18 January, 2024; originally announced January 2024.

Comments: 24 pages

MSC Class: 68 ACM Class: I.2.11

arXiv:2401.09678 [pdf, other]

Integrating Graceful Degradation and Recovery through Requirement-driven Adaptation

Authors: Simon Chu, Justin Koe, David Garlan, Eunsuk Kang

Abstract: Cyber-physical systems (CPS) are subject to environmental uncertainties such as adverse operating conditions, malicious attacks, and hardware degradation. These uncertainties may lead to failures that put the system in a sub-optimal or unsafe state. Systems that are resilient to such uncertainties rely on two types of operations: (1) graceful degradation, to ensure that the system maintains an acceptable level of safety during unexpected environmental conditions and (2) recovery, to facilitate the resumption of normal system functions. Typically, mechanisms for degradation and recovery are developed independently from each other, and later integrated into a system, requiring the designer to develop an additional, ad-hoc logic for activating and coordinating between the two operations. In this paper, we propose a self-adaptation approach for improving system resiliency through automated triggering and coordination of graceful degradation and recovery. The key idea behind our approach is to treat degradation and recovery as requirement-driven adaptation tasks: Degradation can be thought of as temporarily weakening original (i.e., ideal) system requirements to be achieved by the system, and recovery as strengthening the weakened requirements when the environment returns within an expected operating boundary. Furthermore, by treating weakening and strengthening as dual operations, we argue that a single requirement-based adaptation method is sufficient to enable coordination between degradation and recovery.
Given system requirements specified in signal temporal logic (STL), we propose a run-time adaptation framework that performs degradation and recovery in response to environmental changes. We describe a prototype implementation of our framework and demonstrate the feasibility of the proposed approach using a case study in unmanned underwater vehicles.

Submitted 8 April, 2024; v1 submitted 17 January, 2024; originally announced January 2024.

Comments: Pre-print for the SEAMS '24 conference (Software Engineering for Adaptive and Self-Managing Systems Conference)

arXiv:2401.07476 [pdf, other]

Background study of the AMoRE-pilot experiment

Authors: A. Agrawal, V. V. Alenkov, P. Aryal, J. Beyer, B. Bhandari, R. S. Boiko, K. Boonin, O. Buzanov, C. R. Byeon, N. Chanthima, M. K. Cheoun, J. S. Choe, Seonho Choi, S. Choudhury, J. S. Chung, F. A. Danevich, M. Djamal, D. Drung, C. Enss, A. Fleischmann, A. M. Gangapshev, L. Gastaldo, Yu. M. Gavrilyuk, A. M. Gezhaev, O. Gileva , et al. (83 additional authors not shown)

Abstract: We report a study on the background of the Advanced Molybdenum-Based Rare process Experiment (AMoRE), a search for neutrinoless double beta decay ($0νββ$) of $^{100}$Mo. The pilot stage of the experiment was conducted using $\sim$1.9 kg of CaMoO$_4$ crystals at the Yangyang Underground Laboratory, South Korea, from 2015 to 2018. We compared the measured $β/γ$ energy spectra in three experimental configurations with the results of Monte Carlo simulations and identified the background sources in each configuration. We replaced several detector components and enhanced the neutron shielding to lower the background level between configurations. A limit on the half-life of $0νββ$ decay of $^{100}$Mo was found at $T_{1/2}^{0ν} \ge 3.0\times 10^{23}$ years at 90\% confidence level, based on the measured background and its modeling. Further reduction of the background rate in AMoRE-I and AMoRE-II is discussed.

Submitted 7 April, 2024; v1 submitted 15 January, 2024; originally announced January 2024.

arXiv:2401.07462 [pdf, other]

doi 10.1140/epjc/s10052-024-12770-1

Nonproportionality of NaI(Tl) Scintillation Detector for Dark Matter Search Experiments

Authors: S. M. Lee, G. Adhikari, N. Carlin, J. Y. Cho, J. J. Choi, S. Choi, A. C. Ezeribe, L. E. França, C. Ha, I. S. Hahn, S. J. Hollick, E. J. Jeon, H. W. Joo, W. G. Kang, M. Kauer, B. H. Kim, H. J. Kim, J. Kim, K. W. Kim, S. H. Kim, S. K. Kim, S. W. Kim, W. K. Kim, Y. D. Kim, Y. H. Kim, et al. (37 additional authors not shown)

Abstract: We present a comprehensive study of the nonproportionality of NaI(Tl) scintillation detectors within the context of dark matter search experiments. Our investigation, which integrates COSINE-100 data with supplementary $γ$ spectroscopy, measures light yields across diverse energy levels from full-energy $γ$ peaks produced by the decays of various isotopes. These $γ$ peaks of interest were produced by decays supported by both long and short-lived isotopes. Analyzing peaks from decays supported only by short-lived isotopes presented a unique challenge due to their limited statistics and overlapping energies, which was overcome by long-term data collection and a time-dependent analysis. A key achievement is the direct measurement of the 0.87 keV light yield, resulting from the cascade following electron capture decay of $^{22}$Na from internal contamination. This measurement, previously accessible only indirectly, deepens our understanding of NaI(Tl) scintillator behavior in the region of interest for dark matter searches. This study holds substantial implications for background modeling and the interpretation of dark matter signals in NaI(Tl) experiments.

Submitted 10 May, 2024; v1 submitted 14 January, 2024; originally announced January 2024.

Comments: 12 pages, 7 figures

Journal ref: Eur. Phys. J. C 84 (2024) 484

arXiv:2401.04669 [pdf, other]

doi 10.1145/3577193.3593712

Transfer-Learning-Based Autotuning Using Gaussian Copula

Authors: Thomas Randall, Jaehoon Koo, Brice Videau, Michael Kruse, Xingfu Wu, Paul Hovland, Mary Hall, Rong Ge, Prasanna Balaprakash

Abstract: As diverse high-performance computing (HPC) systems are built, many opportunities arise for applications to solve larger problems than ever before. Given the significantly increased complexity of these HPC systems and application tuning, empirical performance tuning, such as autotuning, has emerged as a promising approach in recent years. Despite its effectiveness, autotuning is often a computationally expensive approach. Transfer learning (TL)-based autotuning seeks to address this issue by leveraging the data from prior tuning. Current TL methods for autotuning spend significant time modeling the relationship between parameter configurations and performance, which is ineffective for few-shot (that is, few empirical evaluations) tuning on new tasks. We introduce the first generative TL-based autotuning approach based on the Gaussian copula (GC) to model the high-performing regions of the search space from prior data and then generate high-performing configurations for new tasks. This allows a sampling-based approach that maximizes few-shot performance and provides the first probabilistic estimation of the few-shot budget for effective TL-based autotuning. We compare our generative TL approach with state-of-the-art autotuning techniques on several benchmarks. We find that the GC is capable of achieving 64.37% of peak few-shot performance in its first evaluation. Furthermore, the GC model can determine a few-shot transfer budget that yields up to 33.39$\times$ speedup, a dramatic improvement over the 20.58$\times$ speedup using prior techniques.

Submitted 9 January, 2024; originally announced January 2024.

Comments: 13 pages, 5 figures, 7 tables, the definitive version of this work is published in the Proceedings of the ACM International Conference on Supercomputing 2023, available at https://dl.acm.org/doi/10.1145/3577193.3593712

ACM Class: I.2.4; G.3; D.2.8

Journal ref: Proceedings of the 37th International Conference on Supercomputing (2023) 37-49

arXiv:2401.03650 [pdf, other]

DDD: A Perceptually Superior Low-Response-Time DNN-based Declipper

Authors: Jayeon Yi, Junghyun Koo, Kyogu Lee

Abstract: Clipping is a common nonlinear distortion that occurs whenever the input or output of an audio system exceeds the supported range. This phenomenon undermines not only the perception of speech quality but also downstream processes utilizing the disrupted signal. Therefore, a real-time-capable, robust, and low-response-time method for speech declipping (SD) is desired. In this work, we introduce DDD (Demucs-Discriminator-Declipper), a real-time-capable speech-declipping deep neural network (DNN) that requires less response time by design. We first observe that a previously untested real-time-capable DNN model, Demucs, exhibits a reasonable declipping performance. Then we utilize adversarial learning objectives to increase the perceptual quality of output speech without additional inference overhead. Subjective evaluations on harshly clipped speech show that DDD outperforms the baselines by a wide margin in terms of speech quality. We perform detailed waveform and spectral analyses to gain an insight into the output behavior of DDD in comparison to the baselines. Finally, our streaming simulations also show that DDD is capable of sub-decisecond mean response times, outperforming the state-of-the-art DNN approach by a factor of six.

Submitted 7 January, 2024; originally announced January 2024.

Comments: To appear, ICASSP 2024. Demo samples at https://stet-stet.github.io/DDD, repo at https://github.com/stet-stet/DDD

arXiv:2312.17014 [pdf, ps, other]

doi 10.1007/JHEP03(2024)072

Universality on thermodynamic relation with corrections in de Sitter black holes

Authors: Junbeom Ko, Bogeun Gwak

Abstract: We herein investigate the universal relation proposed by Goon and Penco in de Sitter black holes with electric charge or angular momentum. Our analysis focuses on the cosmological horizon, which only exists in de Sitter and Nariai spacetimes. Because the relation is given in a general case, the overall relationship may be valid. However, we elucidate the details of the relation, highlighting distinctions from those of (anti-)de Sitter black holes while affirming the validity of the relation. Furthermore, based on our analysis of Schwarzschild--de Sitter, Reissner--Nordström--de Sitter, and Kerr--de Sitter black holes, we demonstrate the universality of the thermodynamic relation in de Sitter black holes.

Submitted 14 March, 2024; v1 submitted 28 December, 2023; originally announced December 2023.

Comments: 19 pages, published in JHEP

arXiv:2312.08847 [pdf, other]

Knowledge-Driven Modulation of Neural Networks with Attention Mechanism for Next Activity Prediction

Authors: Ivan Donadello, Jonghyeon Ko, Fabrizio Maria Maggi, Jan Mendling, Francesco Riva, Matthias Weidlich

Abstract: Predictive Process Monitoring (PPM) aims at leveraging historic process execution data to predict how ongoing executions will continue up to their completion. In recent years, PPM techniques for the prediction of the next activities have matured significantly, mainly thanks to the use of Neural Networks (NNs) as a predictor. While their performance is difficult to beat in the general case, there are specific situations where background process knowledge can be helpful. Such knowledge can be leveraged for improving the quality of predictions for exceptional process executions or when the process changes due to a concept drift. In this paper, we present a Symbolic[Neuro] system that leverages background knowledge expressed in terms of a procedural process model to offset the under-sampling in the training data. More specifically, we make predictions using NNs with attention mechanism, an emerging technology in the NN field. The system has been tested on several real-life logs showing an improvement in the performance of the prediction task.

Submitted 14 December, 2023; originally announced December 2023.

MSC Class: 68T20 (Primary) 68T01; 68T05; 68T37 (Secondary) ACM Class: I.2.6; I.2.8; I.2.m

arXiv:2312.07957 [pdf, other]

Scintillation characteristics of an undoped CsI crystal at low-temperature for dark matter search

Authors: W. K. Kim, H. Y. Lee, K. W. Kim, Y. J. Ko, J. A. Jeon, H. J. Kim, H. S. Lee

Abstract: The scintillation characteristics of a 1 g undoped CsI crystal were studied by directly coupling two silicon photomultipliers (SiPMs) over a temperature range from room temperature to 86 K. The scintillation decay time and light output were measured using x-ray and gamma-ray peaks from a $^{109}$Cd radioactive source. An increase in decay time was observed as the temperature decreased from room temperature to 86 K, ranging from 76 ns to 605 ns. The light output increased as well, reaching 37.9 $\pm$ 1.5 photoelectrons per keV electron-equivalent at 86 K, which is approximately 18 times higher than the light yield at room temperature. Leveraging the significantly enhanced scintillation light output of the undoped CsI crystal at low temperature, coupling it with SiPMs results in a promising detector for dark matter search. Both cesium and iodine have an odd proton number, thus they are suitable targets to probe dark matter-proton spin-dependent interactions. We evaluated the sensitivity of the proposed detector to light dark matter-proton spin-dependent interactions. We included the Migdal effect and assumed 200\,kg of undoped CsI crystals for the dark matter search. We conclude that undoped CsI coupled to SiPMs can exhibit world-competitive sensitivities for low-mass dark matter detection, particularly for the dark matter-proton spin-dependent interaction.

Submitted 15 July, 2024; v1 submitted 13 December, 2023; originally announced December 2023.

arXiv:2311.15569 [pdf, other]

Towards Difficulty-Agnostic Efficient Transfer Learning for Vision-Language Models

Authors: Yongjin Yang, Jongwoo Ko, Se-Young Yun

Abstract: Vision-language models (VLMs) like CLIP have demonstrated remarkable applicability across a variety of downstream tasks, including zero-shot image classification. Recently, the use of prompts or adapters for efficient transfer learning (ETL) has gained significant attention for effectively adapting to downstream tasks. However, previous studies have overlooked the challenge of varying transfer difficulty of downstream tasks. In this paper, we empirically analyze how each ETL method behaves with respect to transfer difficulty. Our observations indicate that utilizing vision prompts and text adapters is crucial for adaptability and generalizability in domains with high difficulty. Also, by applying an adaptive ensemble approach that integrates task-adapted VLMs with pre-trained VLMs and strategically leverages more general knowledge in low-difficulty and less in high-difficulty domains, we consistently enhance performance across both types of domains. Based on these observations, we propose an adaptive ensemble method that combines visual prompts and text adapters with pre-trained VLMs, tailored by transfer difficulty, to achieve optimal performance for any target domain. Upon experimenting with extensive benchmarks, our method consistently outperforms all baselines, particularly on unseen tasks, demonstrating its effectiveness.

Submitted 11 October, 2024; v1 submitted 27 November, 2023; originally announced November 2023.

Comments: EMNLP 2024; code available at: https://github.com/YangYongJin/APEX

arXiv:2311.14993 [pdf, other]

Coordinate-Aware Modulation for Neural Fields

Authors: Joo Chan Lee, Daniel Rho, Seungtae Nam, Jong Hwan Ko, Eunbyung Park

Abstract: Neural fields, mapping low-dimensional input coordinates to corresponding signals, have shown promising results in representing various signals. Numerous methodologies have been proposed, and techniques employing MLPs and grid representations have achieved substantial success. MLPs allow compact and high expressibility, yet often suffer from spectral bias and slow convergence speed. On the other hand, methods using grids are free from spectral bias and achieve fast training speed, however, at the expense of high spatial complexity. In this work, we propose a novel way for exploiting both MLPs and grid representations in neural fields. Unlike the prevalent methods that combine them sequentially (extract features from the grids first and feed them to the MLP), we inject spectral bias-free grid representations into the intermediate features in the MLP. More specifically, we suggest a Coordinate-Aware Modulation (CAM), which modulates the intermediate features using scale and shift parameters extracted from the grid representations. This can maintain the strengths of MLPs while mitigating any remaining potential biases, facilitating the rapid learning of high-frequency components. In addition, we empirically found that the feature normalizations, which have not been successful in neural field literature, proved to be effective when applied in conjunction with the proposed CAM. Experimental results demonstrate that CAM enhances the performance of neural representation and improves learning stability across a range of signals.
Especially in the novel view synthesis task, we achieved state-of-the-art performance with the least number of parameters and fast training speed for dynamic scenes and the best performance under 1MB memory for static scenes. CAM also outperforms the best-performing video compression methods using neural fields by a large margin.

Submitted 25 November, 2023; originally announced November 2023.

Comments: Project page: http://maincold2.github.io/cam/

arXiv:2311.13831 [pdf, other]

Posterior Distillation Sampling

Authors: Juil Koo, Chanho Park, Minhyuk Sung

Abstract: We introduce Posterior Distillation Sampling (PDS), a novel optimization method for parametric image editing based on diffusion models. Existing optimization-based methods, which leverage the powerful 2D prior of diffusion models to handle various parametric images, have mainly focused on generation. Unlike generation, editing requires a balance between conforming to the target attribute and preserving the identity of the source content. Recent 2D image editing methods have achieved this balance by leveraging the stochastic latent encoded in the generative process of diffusion models. To extend the editing capabilities of diffusion models shown in pixel space to parameter space, we reformulate the 2D image editing method into an optimization form named PDS. PDS matches the stochastic latents of the source and the target, enabling the sampling of targets in diverse parameter spaces that align with a desired attribute while maintaining the source's identity. We demonstrate that this optimization resembles running a generative process with the target attribute, but aligning this process with the trajectory of the source's generative process. Extensive editing results in Neural Radiance Fields and Scalable Vector Graphics representations demonstrate that PDS is capable of sampling targets to fulfill the aforementioned balance across various parameter spaces.

Submitted 31 March, 2024; v1 submitted 23 November, 2023; originally announced November 2023.

Comments: Project page: https://posterior-distillation-sampling.github.io/

arXiv:2311.13681 [pdf, other]

Compact 3D Gaussian Representation for Radiance Field

Authors: Joo Chan Lee, Daniel Rho, Xiangyu Sun, Jong Hwan Ko, Eunbyung Park

Abstract: Neural Radiance Fields (NeRFs) have demonstrated remarkable potential in capturing complex 3D scenes with high fidelity. However, one persistent challenge that hinders the widespread adoption of NeRFs is the computational bottleneck due to the volumetric rendering. On the other hand, 3D Gaussian splatting (3DGS) has recently emerged as an alternative representation that leverages a 3D Gaussian-based representation and adopts the rasterization pipeline to render the images rather than volumetric rendering, achieving very fast rendering speed and promising image quality. However, a significant drawback arises as 3DGS entails a substantial number of 3D Gaussians to maintain the high fidelity of the rendered images, which requires a large amount of memory and storage. To address this critical issue, we place a specific emphasis on two key objectives: reducing the number of Gaussian points without sacrificing performance and compressing the Gaussian attributes, such as view-dependent color and covariance. To this end, we propose a learnable mask strategy that significantly reduces the number of Gaussians while preserving high performance. In addition, we propose a compact but effective representation of view-dependent color by employing a grid-based neural field rather than relying on spherical harmonics. Finally, we learn codebooks to compactly represent the geometric attributes of Gaussian by vector quantization.
With model compression techniques such as quantization and entropy coding, we consistently show over 25$\times$ reduced storage and enhanced rendering speed, while maintaining the quality of the scene representation, compared to 3DGS. Our work provides a comprehensive framework for 3D scene representation, achieving high performance, fast training, compactness, and real-time rendering. Our project page is available at https://maincold2.github.io/c3dgs/.

Submitted 15 February, 2024; v1 submitted 22 November, 2023; originally announced November 2023.

Comments: Project page: http://maincold2.github.io/c3dgs/

arXiv:2311.09585 [pdf, other]

LifeTox: Unveiling Implicit Toxicity in Life Advice

Authors: Minbeom Kim, Jahyun Koo, Hwanhee Lee, Joonsuk Park, Hwaran Lee, Kyomin Jung

Abstract: As large language models become increasingly integrated into daily life, detecting implicit toxicity across diverse contexts is crucial. To this end, we introduce LifeTox, a dataset designed for identifying implicit toxicity within a broad range of advice-seeking scenarios. Unlike existing safety datasets, LifeTox comprises diverse contexts derived from personal experiences through open-ended questions. Experiments demonstrate that RoBERTa fine-tuned on LifeTox matches or surpasses the zero-shot performance of large language models in toxicity classification tasks. These results underscore the efficacy of LifeTox in addressing the complex challenges inherent in implicit toxicity. We open-sourced the dataset (https://huggingface.co/datasets/mbkim/LifeTox) and the LifeTox moderator family: 350M, 7B, and 13B.

Submitted 18 March, 2024; v1 submitted 16 November, 2023; originally announced November 2023.

Comments: 11 pages, 5 figures, NAACL 2024

arXiv:2311.07837 [pdf, ps, other]

Gauss's form class groups and Shimura's canonical models

Authors: Ja Kyung Koo, Dong Hwa Shin, Dong Sung Yoon

Abstract: Let $N$ be a positive integer and $Γ$ be a subgroup of $\mathrm{SL}_2(\mathbb{Z})$ containing $Γ_1(N)$. Let $K$ be an imaginary quadratic field and $\mathcal{O}$ be an order of discriminant $D_\mathcal{O}$ in $K$. Under some assumptions, we show that $Γ$ induces a form class group of discriminant $D_\mathcal{O}$ (or of order $\mathcal{O}$) and level $N$ if and only if there is a certain canonical model of the modular curve for $Γ$ defined over a suitably small number field. In this way we can find an interesting link between two different subjects, which will be useful in the study of certain quadratic Diophantine equations in terms of primes $p$.

Submitted 7 March, 2024; v1 submitted 13 November, 2023; originally announced November 2023.

Comments: 18 pages, The title has been changed

MSC Class: Primary 11R37; Secondary 11E12; 11R65

arXiv:2311.07607 [pdf, other]

Modeling Choice via Self-Attention

Authors: Joohwan Ko, Andrew A. Li

Abstract: Models of choice are a fundamental input to many now-canonical optimization problems in the field of Operations Management, including assortment, inventory, and price optimization. Naturally, accurate estimation of these models from data is a critical step in the application of these optimization problems in practice. Concurrently, recent advancements in deep learning have sparked interest in integrating these techniques into choice modeling. However, there is a noticeable research gap at the intersection of deep learning and choice modeling, particularly with both theoretical and empirical foundations. Thus motivated, we first propose a choice model that is the first to successfully (both theoretically and practically) leverage a modern neural network architectural concept (self-attention). Theoretically, we show that our attention-based choice model is a low-rank generalization of the Halo Multinomial Logit (Halo-MNL) model. We prove that whereas the Halo-MNL requires $Ω(m^2)$ data samples to estimate, where $m$ is the number of products, our model supports a natural nonconvex estimator (in particular, that which a standard neural network implementation would apply) which admits a near-optimal stationary point with $O(m)$ samples. Additionally, we establish the first realistic-scale benchmark for choice model estimation on real data, conducting the most extensive evaluation of existing models to date, thereby highlighting our model's superior performance.

Submitted 8 February, 2024; v1 submitted 11 November, 2023; originally announced November 2023.

arXiv:2311.06320 [pdf, other]

Pursuing equitable access to vaccines for the next epidemic

Authors: Hsin-Ju Chou, Jing-Yuan Ko, Sung-Po Chao

Abstract: To mitigate the pandemic stemming from COVID-19, numerous nations have initiated extensive vaccination campaigns for their citizens since late 2020. While affluent countries have predominantly received vaccine allocations, fewer doses have been dispatched to nations with lower average incomes. This unequal distribution not only widens the disparity between wealthy and impoverished regions but also prolongs the pandemic, evident in the emergence of new viral variants. Our research delves into the correlation between the duration of the pandemic and the timing of vaccine distribution between two countries with migratory ties. By using a pair of coupled Susceptible-Infected-Recovered-Deceased (SIRD) models incorporating vaccination data, we demonstrate that timely sharing of vaccines benefits both nations, regardless of the presence of viral variants. This underscores that in the realm of vaccine distribution, self-interest and altruism are not mutually exclusive.

Submitted 28 March, 2024; v1 submitted 10 November, 2023; originally announced November 2023.

Comments: v.2: 14 pages, 7 sets of figures (English rephrasing, figures replotted, reference added)

arXiv:2311.05010 [pdf, other]

doi 10.1016/j.astropartphys.2024.102945

Alpha backgrounds in NaI(Tl) crystals of COSINE-100

Authors: G. Adhikari, N. Carlin, D. F. F. S. Cavalcante, J. Y. Cho, J. J. Choi, S. Choi, A. C. Ezeribe, L. E. Franca, C. Ha, I. S. Hahn, S. J. Hollick, E. J. Jeon, H. W. Joo, W. G. Kang, M. Kauer, B. H. Kim, H. J. Kim, J. Kim, K. W. Kim, S. H. Kim, S. K. Kim, S. W. Kim, W. K. Kim, Y. D. Kim, Y. H. Kim, et al. (38 additional authors not shown)

Abstract: COSINE-100 is a dark matter direct detection experiment with 106 kg NaI(Tl) as the target material. 210Pb and daughter isotopes are a dominant background in the WIMP region of interest and are detected via beta decay and alpha decay. Analysis of the alpha channel complements the background model as observed in the beta/gamma channel. We present the measurement of the quenching factors and Monte Carlo simulation results and activity quantification of the alpha decay components of the COSINE-100 NaI(Tl) crystals. The data strongly indicate that the alpha decays probabilistically undergo two possible quenching factors but require further investigation. The fitted results are consistent with independent measurements and improve the overall understanding of the COSINE-100 backgrounds. Furthermore, the half-life of 216Po has been measured to be 143.4 +/- 1.2 ms, which is consistent with and more precise than recent measurements.

Submitted 30 January, 2024; v1 submitted 8 November, 2023; originally announced November 2023.

arXiv:2310.20258 [pdf, other]

Advancing Bayesian Optimization via Learning Correlated Latent Space

Authors: Seunghun Lee, Jaewon Chu, Sihyeon Kim, Juyeon Ko, Hyunwoo J. Kim

Abstract: Bayesian optimization is a powerful method for optimizing black-box functions with limited function evaluations. Recent works have shown that optimization in a latent space through deep generative models such as variational autoencoders leads to effective and efficient Bayesian optimization for structured or discrete data. However, as the optimization does not take place in the input space, it leads to an inherent gap that results in potentially suboptimal solutions. To alleviate the discrepancy, we propose Correlated latent space Bayesian Optimization (CoBO), which focuses on learning correlated latent spaces characterized by a strong correlation between the distances in the latent space and the distances within the objective function. Specifically, our method introduces Lipschitz regularization, loss weighting, and trust region recoordination to minimize the inherent gap around the promising areas. We demonstrate the effectiveness of our approach on several optimization tasks in discrete data, such as molecule design and arithmetic expression fitting, and achieve high performance within a small budget.

Submitted 19 November, 2023; v1 submitted 31 October, 2023; originally announced October 2023.

arXiv:2310.17668 [pdf, other]

Fine tuning Pre trained Models for Robustness Under Noisy Labels

Authors: Sumyeong Ahn, Sihyeon Kim, Jongwoo Ko, Se-Young Yun

Abstract: The presence of noisy labels in a training dataset can significantly impact the performance of machine learning models. To tackle this issue, researchers have explored methods for Learning with Noisy Labels to identify clean samples and reduce the influence of noisy labels. However, constraining the influence of a certain portion of the training dataset can result in a reduction in overall generalization performance. To alleviate this, recent studies have considered the careful utilization of noisy labels by leveraging huge computational resources. Therefore, the increasing training cost necessitates a reevaluation of efficiency. In other areas of research, there has been a focus on developing fine-tuning techniques for large pre-trained models that aim to achieve both high generalization performance and efficiency. However, these methods have mainly concentrated on clean datasets, and there has been limited exploration of the noisy label scenario. In this research, our aim is to find an appropriate way to fine-tune pre-trained models for noisy labeled datasets. To achieve this goal, we investigate the characteristics of pre-trained models when they encounter noisy datasets. Through empirical analysis, we introduce a novel algorithm called TURN, which robustly and efficiently transfers the prior knowledge of pre-trained models. The algorithm consists of two main steps: (1) independently tuning the linear classifier to protect the feature extractor from being distorted by noisy labels, and (2) reducing the noisy label ratio and fine-tuning the entire model based on the noise-reduced dataset to adapt it to the target dataset. The proposed algorithm has been extensively tested and demonstrates efficient yet improved denoising performance on various benchmarks compared to previous methods.

Submitted 24 October, 2023; originally announced October 2023.

Comments: 10 pages (17 pages including supplementary)

MSC Class: Computer Science; Artificial Intelligence

arXiv:2310.15668 [pdf, other]

Hypergraph Motifs and Their Extensions Beyond Binary

Authors: Geon Lee, Seokbum Yoon, Jihoon Ko, Hyunju Kim, Kijung Shin

Abstract: Hypergraphs naturally represent group interactions, which are omnipresent in many domains: collaborations of researchers, co-purchases of items, and joint interactions of proteins, to name a few. In this work, we propose tools for answering the following questions: (Q1) what are the structural design principles of real-world hypergraphs? (Q2) how can we compare local structures of hypergraphs of different sizes? (Q3) how can we identify domains from which hypergraphs are? We first define hypergraph motifs (h-motifs), which describe the overlapping patterns of three connected hyperedges. Then, we define the significance of each h-motif in a hypergraph as its occurrences relative to those in properly randomized hypergraphs. Lastly, we define the characteristic profile (CP) as the vector of the normalized significance of every h-motif. Regarding Q1, we find that h-motifs' occurrences in 11 real-world hypergraphs from 5 domains are clearly distinguished from those of randomized hypergraphs. Then, we demonstrate that CPs capture local structural patterns unique to each domain, and thus comparing CPs of hypergraphs addresses Q2 and Q3. The concept of CP is extended to represent the connectivity pattern of each node or hyperedge as a vector, which proves useful in node classification and hyperedge prediction. Our algorithmic contribution is to propose MoCHy, a family of parallel algorithms for counting h-motifs' occurrences in a hypergraph. We theoretically analyze their speed and accuracy and show empirically that the advanced approximate version MoCHy-A+ is more accurate and faster than the basic approximate and exact versions, respectively. Furthermore, we explore ternary hypergraph motifs that extend h-motifs by taking into account not only the presence but also the cardinality of intersections among hyperedges. This extension proves beneficial for all previously mentioned applications.

Submitted 24 October, 2023; originally announced October 2023.

Comments: Extended version of VLDB 2020 paper arXiv:2003.01853

arXiv:2310.14055 [pdf, other]

Spectral Phase Transitions in Non-Linear Wigner Spiked Models

Authors: Alice Guionnet, Justin Ko, Florent Krzakala, Pierre Mergny, Lenka Zdeborová

Abstract: We study the asymptotic behavior of the spectrum of a random matrix where a non-linearity is applied entry-wise to a Wigner matrix perturbed by a rank-one spike with independent and identically distributed entries. In this setting, we show that when the signal-to-noise ratio scales as $N^{\frac{1}{2} (1-1/k_\star)}$, where $k_\star$ is the first non-zero generalized information coefficient of the function, the non-linear spike model effectively behaves as an equivalent spiked Wigner matrix, where the former spike before the non-linearity is now raised to a power $k_\star$. This allows us to study the phase transition of the leading eigenvalues, generalizing part of the work of Baik, Ben Arous and Peché to these non-linear models.

Submitted 21 October, 2023; originally announced October 2023.

Comments: 27 pages

MSC Class: 60B20

arXiv:2310.10054 [pdf, other]

NASH: A Simple Unified Framework of Structured Pruning for Accelerating Encoder-Decoder Language Models

Authors: Jongwoo Ko, Seungjoon Park, Yujin Kim, Sumyeong Ahn, Du-Seong Chang, Euijai Ahn, Se-Young Yun

Abstract: Structured pruning methods have proven effective in reducing the model size and accelerating inference speed in various network architectures such as Transformers. Despite the versatility of encoder-decoder models in numerous NLP tasks, the structured pruning methods on such models are relatively less explored compared to encoder-only models. In this study, we investigate the behavior of the structured pruning of the encoder-decoder models in the decoupled pruning perspective of the encoder and decoder component, respectively. Our findings highlight two insights: (1) the number of decoder layers is the dominant factor of inference speed, and (2) low sparsity in the pruned encoder network enhances generation quality. Motivated by these findings, we propose a simple and effective framework, NASH, that narrows the encoder and shortens the decoder networks of encoder-decoder models. Extensive experiments on diverse generation and inference tasks validate the effectiveness of our method in both speedup and output quality.

Submitted 16 October, 2023; originally announced October 2023.

Comments: Findings of the Association for Computational Linguistics: EMNLP 2023

arXiv:2310.07498 [pdf, other]

Low-mass Quiescent Galaxies Are Small in Isolated Environments: Environmental Dependence of the Mass-Size Relation of Low-mass Quiescent Galaxies

Authors: Yongmin Yoon, Jae-Woo Kim, Jongwan Ko

Abstract: We study the mass-size relation of quiescent galaxies across various environments, with a particular focus on its environmental dependence at the low-mass part of $\log(M_\mathrm{star}/M_{\odot})\lesssim10.0$. Our sample consists of 13,667 quiescent galaxies with $\log(M_\mathrm{star}/M_{\odot})\ge9.4$ and $0.01<z<0.04$ from the Sloan Digital Sky Survey. We find that the mass-size relation of low-mass quiescent galaxies (LQGs) with $\log(M_\mathrm{star}/M_{\odot})\lesssim10.0$ depends on their environment, with LQGs in the highest-density environments exhibiting an average size $\sim70\%$ larger than those in isolated environments. Moreover, the slope of the mass-size relation for LQGs in high-density environments is significantly shallower than that of their counterparts in isolated environments. This is in contrast with high-mass quiescent galaxies with $\log(M_\mathrm{star}/M_{\odot})\gtrsim10.5$ that show a nearly identical mass-size relation across all environments. Combined with additional discoveries that the mass-size relation slopes of LQGs and star-forming galaxies are similar to each other in high-density environments, and that LQGs in higher-density environments exhibit more disk-like structures, our results support the idea that LQGs in high-density environments have evolved from star-forming galaxies through environmental effects, which are capable of causing their quenching and transformation into quiescent galaxies. With the aid of an analysis of merger rates for simulated galaxies from a cosmological galaxy formation simulation, we suggest that the steep slope and low normalization of the mass-size relation of LQGs in the lowest-density environments may originate from recent gas-rich mergers, which occur over 10-30 times more frequently in the progenitors of LQGs in the lowest-density environments than in their counterparts in high-density environments at low redshifts.

Submitted 11 October, 2023; originally announced October 2023.

Comments: 20 pages, 14 figures, 1 table, accepted for publication in the ApJ

arXiv:2310.06511 [pdf, other]

Self-Supervised Dataset Distillation for Transfer Learning

Authors: Dong Bok Lee, Seanie Lee, Joonho Ko, Kenji Kawaguchi, Juho Lee, Sung Ju Hwang

Abstract: Dataset distillation methods have achieved remarkable success in distilling a large dataset into a small set of representative samples. However, they are not designed to produce a distilled dataset that can be effectively used for facilitating self-supervised pre-training. To this end, we propose a novel problem of distilling an unlabeled dataset into a set of small synthetic samples for efficient self-supervised learning (SSL). We first prove that a gradient of synthetic samples with respect to a SSL objective in naive bilevel optimization is \textit{biased} due to the randomness originating from data augmentations or masking. To address this issue, we propose to minimize the mean squared error (MSE) between a model's representations of the synthetic examples and their corresponding learnable target feature representations for the inner objective, which does not introduce any randomness. Our primary motivation is that the model obtained by the proposed inner optimization can mimic the \textit{self-supervised target model}. To achieve this, we also introduce the MSE between representations of the inner model and the self-supervised target model on the original full dataset for outer optimization. Lastly, assuming that a feature extractor is fixed, we only optimize a linear head on top of the feature extractor, which allows us to reduce the computational cost and obtain a closed-form solution of the head with kernel ridge regression. We empirically validate the effectiveness of our method on various applications involving transfer learning.

Submitted 11 April, 2024; v1 submitted 10 October, 2023; originally announced October 2023.

arXiv:2310.05424 [pdf, other]

Fast and Robust Early-Exiting Framework for Autoregressive Language Models with Synchronized Parallel Decoding

Authors: Sangmin Bae, Jongwoo Ko, Hwanjun Song, Se-Young Yun

Abstract: To tackle the high inference latency exhibited by autoregressive language models, previous studies have proposed an early-exiting framework that allocates adaptive computation paths for each token based on the complexity of generating the subsequent token. However, we observed several shortcomings, including performance degradation caused by a state copying mechanism or numerous exit paths, and sensitivity to exit confidence thresholds. Consequently, we propose a Fast and Robust Early-Exiting (FREE) framework, which incorporates a shallow-deep module and a synchronized parallel decoding. Our framework enables faster inference by synchronizing the decoding process of the current token with previously stacked early-exited tokens. Furthermore, as parallel decoding allows us to observe predictions from both shallow and deep models, we present a novel adaptive threshold estimator that exploits a Beta mixture model to determine suitable confidence thresholds. We empirically demonstrated the superiority of our proposed framework on extensive generation tasks.

Submitted 9 October, 2023; originally announced October 2023.

Comments: EMNLP 2023 (Long)

arXiv:2310.02823 [pdf, other]

Learning to Scale Logits for Temperature-Conditional GFlowNets

Authors: Minsu Kim, Joohwan Ko, Taeyoung Yun, Dinghuai Zhang, Ling Pan, Woochang Kim, Jinkyoo Park, Emmanuel Bengio, Yoshua Bengio

Abstract: GFlowNets are probabilistic models that sequentially generate compositional structures through a stochastic policy. Among GFlowNets, temperature-conditional GFlowNets can introduce temperature-based controllability for exploration and exploitation. We propose \textit{Logit-scaling GFlowNets} (Logit-GFN), a novel architectural design that greatly accelerates the training of temperature-conditional GFlowNets. It is based on the idea that previously proposed approaches introduced numerical challenges in the deep network training, since different temperatures may give rise to very different gradient profiles as well as magnitudes of the policy's logits. We find that the challenge is greatly reduced if a learned function of the temperature is used to scale the policy's logits directly. Also, using Logit-GFN, GFlowNets can be improved by having better generalization capabilities in offline learning and mode discovery capabilities in online learning, which is empirically verified in various biological and chemical tasks. Our code is available at \url{https://github.com/dbsxodud-11/logit-gfn}

Submitted 2 June, 2024; v1 submitted 4 October, 2023; originally announced October 2023.

Comments: ICML 2024, 23 pages, 21 figures

arXiv:2310.00109 [pdf, other]

FedAIoT: A Federated Learning Benchmark for Artificial Intelligence of Things

Authors: Samiul Alam, Tuo Zhang, Tiantian Feng, Hui Shen, Zhichao Cao, Dong Zhao, JeongGil Ko, Kiran Somasundaram, Shrikanth S. Narayanan, Salman Avestimehr, Mi Zhang

Abstract: There is a significant relevance of federated learning (FL) in the realm of Artificial Intelligence of Things (AIoT). However, most existing FL works do not use datasets collected from authentic IoT devices and thus do not capture unique modalities and inherent challenges of IoT data. To fill this critical gap, in this work, we introduce FedAIoT, an FL benchmark for AIoT. FedAIoT includes eight datasets collected from a wide range of IoT devices. These datasets cover unique IoT modalities and target representative applications of AIoT. FedAIoT also includes a unified end-to-end FL framework for AIoT that simplifies benchmarking the performance of the datasets. Our benchmark results shed light on the opportunities and challenges of FL for AIoT. We hope FedAIoT could serve as an invaluable resource to foster advancements in the important field of FL for AIoT. The repository of FedAIoT is maintained at https://github.com/AIoT-MLSys-Lab/FedAIoT.

Submitted 21 August, 2024; v1 submitted 29 September, 2023; originally announced October 2023.

Comments: Camera-ready version of the Journal of Data-centric Machine Learning Research (DMLR)

arXiv:2309.10310 [pdf, other]

TensorCodec: Compact Lossy Compression of Tensors without Strong Data Assumptions

Authors: Taehyung Kwon, Jihoon Ko, Jinhong Jung, Kijung Shin

Abstract: Many real-world datasets are represented as tensors, i.e., multi-dimensional arrays of numerical values. Storing them without compression often requires substantial space, which grows exponentially with the order. While many tensor compression algorithms are available, many of them rely on strong data assumptions regarding its order, sparsity, rank, and smoothness. In this work, we propose TENSORCODEC, a lossy compression algorithm for general tensors that do not necessarily adhere to strong input data assumptions. TENSORCODEC incorporates three key ideas. The first idea is Neural Tensor-Train Decomposition (NTTD) where we integrate a recurrent neural network into Tensor-Train Decomposition to enhance its expressive power and alleviate the limitations imposed by the low-rank assumption. Another idea is to fold the input tensor into a higher-order tensor to reduce the space required by NTTD. Finally, the mode indices of the input tensor are reordered to reveal patterns that can be exploited by NTTD for improved approximation. Our analysis and experiments on 8 real-world datasets demonstrate that TENSORCODEC is (a) Concise: it gives up to 7.38x more compact compression than the best competitor with similar reconstruction error, (b) Accurate: given the same budget for compressed size, it yields up to 3.33x more accurate reconstruction than the best competitor, (c) Scalable: its empirical compression time is linear in the number of tensor entries, and it reconstructs each entry in logarithmic time. Our code and datasets are available at https://github.com/kbrother/TensorCodec.

Submitted 20 September, 2023; v1 submitted 19 September, 2023; originally announced September 2023.

Comments: Accepted to ICDM 2023 - IEEE International Conference on Data Mining 2023

arXiv:2309.10069 [pdf]

Sex-based Disparities in Brain Aging: A Focus on Parkinson's Disease

Authors: Iman Beheshti, Samuel Booth, Ji Hyun Ko

Abstract: PD is linked to faster brain aging. Sex is recognized as an important factor in PD, such that males are twice as likely as females to have the disease and have more severe symptoms and a faster progression rate. Despite previous research, there remains a significant gap in understanding the function of sex in the process of brain aging in PD patients. The T1-weighted MRI-driven brain-predicted age difference was computed in a group of 373 PD patients from the PPMI database using a robust brain-age estimation framework that was trained on 949 healthy subjects. Linear regression models were used to investigate the association between brain-PAD and clinical variables in PD, stratified by sex. All female PD patients were used in the correlational analysis while the same number of males were selected based on propensity score matching method considering age, education level, age of symptom onset, and clinical symptom severity. Despite both patient groups being matched for demographics, motor and non-motor symptoms, it was observed that males with Parkinson's disease exhibited a significantly higher mean brain age-delta than their female counterparts. In the propensity score-matched PD male group, brain-PAD was found to be associated with a decline in general cognition, a worse degree of sleep behavior disorder, reduced visuospatial acuity, and caudate atrophy. Conversely, no significant links were observed between these factors and brain-PAD in the PD female group.

Submitted 18 September, 2023; originally announced September 2023.

Comments: 35 pages, 5 figures

arXiv:2309.07471 [pdf, other]

EP2P-Loc: End-to-End 3D Point to 2D Pixel Localization for Large-Scale Visual Localization

Authors: Minjung Kim, Junseo Koo, Gunhee Kim

Abstract: Visual localization is the task of estimating a 6-DoF camera pose of a query image within a provided 3D reference map. Thanks to recent advances in various 3D sensors, 3D point clouds are becoming a more accurate and affordable option for building the reference map, but research to match the points of 3D point clouds with pixels in 2D images for visual localization remains challenging. Existing approaches that jointly learn 2D-3D feature matching suffer from low inliers due to representational differences between the two modalities, and the methods that bypass this problem into classification have an issue of poor refinement. In this work, we propose EP2P-Loc, a novel large-scale visual localization method that mitigates such appearance discrepancy and enables end-to-end training for pose estimation. To increase the number of inliers, we propose a simple algorithm to remove invisible 3D points in the image, and find all 2D-3D correspondences without keypoint detection. To reduce memory usage and search complexity, we take a coarse-to-fine approach where we extract patch-level features from 2D images, then perform 2D patch classification on each 3D point, and obtain the exact corresponding 2D pixel coordinates through positional encoding. Finally, for the first time in this task, we employ a differentiable PnP for end-to-end training. In the experiments on newly curated large-scale indoor and outdoor benchmarks based on 2D-3D-S and KITTI, we show that our method achieves the state-of-the-art performance compared to existing visual localization and image-to-point cloud registration methods.

Submitted 14 September, 2023; originally announced September 2023.

Comments: Accepted to ICCV 2023

arXiv:2308.12599 [pdf, other]

Exploiting Time-Frequency Conformers for Music Audio Enhancement

Authors: Yunkee Chae, Junghyun Koo, Sungho Lee, Kyogu Lee

Abstract: With the proliferation of video platforms on the internet, recording musical performances by mobile devices has become commonplace. However, these recordings often suffer from degradation such as noise and reverberation, which negatively impact the listening experience. Consequently, the necessity for music audio enhancement (referred to as music enhancement from this point onward), involving the transformation of degraded audio recordings into pristine high-quality music, has surged to augment the auditory experience. To address this issue, we propose a music enhancement system based on the Conformer architecture that has demonstrated outstanding performance in speech enhancement tasks. Our approach explores the attention mechanisms of the Conformer and examines their performance to discover the best approach for the music enhancement task. Our experimental results show that our proposed model achieves state-of-the-art performance on single-stem music enhancement. Furthermore, our system can perform general music enhancement with multi-track mixtures, which has not been examined in previous work.

Submitted 24 August, 2023; originally announced August 2023.

Comments: Accepted by ACM Multimedia 2023

arXiv:2308.11916 [pdf, other]

Semantic-Aware Implicit Template Learning via Part Deformation Consistency

Authors: Sihyeon Kim, Minseok Joo, Jaewon Lee, Juyeon Ko, Juhan Cha, Hyunwoo J. Kim

Abstract: Learning implicit templates as neural fields has recently shown impressive performance in unsupervised shape correspondence. Despite the success, we observe current approaches, which solely rely on geometric information, often learn suboptimal deformation across generic object shapes, which have high structural variability. In this paper, we highlight the importance of part deformation consistency and propose a semantic-aware implicit template learning framework to enable semantically plausible deformation. By leveraging semantic prior from a self-supervised feature extractor, we suggest local conditioning with novel semantic-aware deformation code and deformation consistency regularizations regarding part deformation, global deformation, and global scaling. Our extensive experiments demonstrate the superiority of the proposed method over baselines in various tasks: keypoint transfer, part label transfer, and texture transfer. More interestingly, our framework shows a larger performance gain under more challenging settings. We also provide qualitative analyses to validate the effectiveness of semantic-aware deformation. The code is available at https://github.com/mlvlab/PDC.

Submitted 23 August, 2023; originally announced August 2023.

Comments: ICCV camera-ready version

arXiv:2308.11250 [pdf, ps, other]

Class fields and form class groups for solving certain quadratic Diophantine equations

Authors: Ho Yun Jung, Ja Kyung Koo, Dong Hwa Shin, Dong Sung Yoon

Abstract: Let $K$ be an imaginary quadratic field and $\mathcal{O}$ be an order in $K$. We construct class fields associated with form class groups which are isomorphic to certain $\mathcal{O}$-ideal class groups in terms of the theory of canonical models due to Shimura. As its applications, by using such class fields, for a positive integer $n$ we first find primes of the form $x^2+ny^2$ with additional conditions on $x$ and $y$. Second, by utilizing these form class groups, we derive a congruence relation on special values of a modular function of higher level as an analogue of Kronecker's congruence relation.

Submitted 24 February, 2024; v1 submitted 22 August, 2023; originally announced August 2023.

Comments: 30 pages; the title has been changed

MSC Class: Primary 11R37; Secondary 11E12; 11R65

Showing 101–150 of 515 results for author: Koo, J