-
Accelerated Evaluation of Ollivier-Ricci Curvature Lower Bounds: Bridging Theory and Computation
Authors:
Wonwoo Kang,
Heehyun Park
Abstract:
Curvature serves as a potent and descriptive invariant, with its efficacy validated both theoretically and practically within graph theory. We employ a definition of generalized Ricci curvature proposed by Ollivier, which Lin and Yau later adapted to graph theory, known as Ollivier-Ricci curvature (ORC). ORC measures curvature using the Wasserstein distance, thereby integrating geometric concepts…
▽ More
Curvature serves as a potent and descriptive invariant, with its efficacy validated both theoretically and practically within graph theory. We employ a definition of generalized Ricci curvature proposed by Ollivier, which Lin and Yau later adapted to graph theory, known as Ollivier-Ricci curvature (ORC). ORC measures curvature using the Wasserstein distance, thereby integrating geometric concepts with probability theory and optimal transport. Jost and Liu previously discussed the lower bound of ORC by showing the upper bound of the Wasserstein distance. We extend the applicability of these bounds to discrete spaces with metrics on integers, specifically hypergraphs. Compared to prior work on ORC in hypergraphs by Coupette, Dalleiger, and Rieck, which faced computational challenges, our method introduces a simplified approach with linear computational complexity, making it particularly suitable for analyzing large-scale networks. Through extensive simulations and application to synthetic and real-world datasets, we demonstrate the significant improvements our method offers in evaluating ORC.
△ Less
Submitted 21 May, 2024;
originally announced May 2024.
-
The Association Between SOC and Land Prices Considering Spatial Heterogeneity Based on Finite Mixture Modeling
Authors:
Woo Seok Kang,
Eunchan Kim,
Wookjae Heo
Abstract:
An understanding of how Social Overhead Capital (SOC) is associated with the land value of the local community is important for effective urban planning. However, even within a district, there are multiple sections used for different purposes; the term for this is spatial heterogeneity. The spatial heterogeneity issue has to be considered when attempting to comprehend land prices. If there is spat…
▽ More
An understanding of how Social Overhead Capital (SOC) is associated with the land value of the local community is important for effective urban planning. However, even within a district, there are multiple sections used for different purposes; the term for this is spatial heterogeneity. The spatial heterogeneity issue has to be considered when attempting to comprehend land prices. If there is spatial heterogeneity within a district, land prices can be managed by adopting the spatial clustering method. In this study, spatial attributes including SOC, socio-demographic features, and spatial information in a specific district are analyzed with Finite Mixture Modeling (FMM) in order to find (a) the optimal number of clusters and (b) the association among SOCs, socio-demographic features, and land prices. FMM is a tool used to find clusters and the attributes' coefficients simultaneously. Using the FMM method, the results show that four clusters exist in one district and the four clusters have different associations among SOCs, demographic features, and land prices. Policymakers and managerial administration need to look for information to make policy about land prices. The current study finds the consideration of closeness to SOC to be a significant factor on land prices and suggests the potential policy direction related to SOC.
△ Less
Submitted 15 November, 2022;
originally announced November 2022.
-
Soft Truncation: A Universal Training Technique of Score-based Diffusion Model for High Precision Score Estimation
Authors:
Dongjun Kim,
Seungjae Shin,
Kyungwoo Song,
Wanmo Kang,
Il-Chul Moon
Abstract:
Recent advances in diffusion models bring state-of-the-art performance on image generation tasks. However, empirical results from previous research in diffusion models imply an inverse correlation between density estimation and sample generation performances. This paper investigates with sufficient empirical evidence that such inverse correlation happens because density estimation is significantly…
▽ More
Recent advances in diffusion models bring state-of-the-art performance on image generation tasks. However, empirical results from previous research in diffusion models imply an inverse correlation between density estimation and sample generation performances. This paper investigates with sufficient empirical evidence that such inverse correlation happens because density estimation is significantly contributed by small diffusion time, whereas sample generation mainly depends on large diffusion time. However, training a score network well across the entire diffusion time is demanding because the loss scale is significantly imbalanced at each diffusion time. For successful training, therefore, we introduce Soft Truncation, a universally applicable training technique for diffusion models, that softens the fixed and static truncation hyperparameter into a random variable. In experiments, Soft Truncation achieves state-of-the-art performance on CIFAR-10, CelebA, CelebA-HQ 256x256, and STL-10 datasets.
△ Less
Submitted 10 June, 2022; v1 submitted 10 June, 2021;
originally announced June 2021.
-
Neural Posterior Regularization for Likelihood-Free Inference
Authors:
Dongjun Kim,
Kyungwoo Song,
Seungjae Shin,
Wanmo Kang,
Il-Chul Moon,
Weonyoung Joo
Abstract:
A simulation is useful when the phenomenon of interest is either expensive to regenerate or irreproducible with the same context. Recently, Bayesian inference on the distribution of the simulation input parameter has been implemented sequentially to minimize the required simulation budget for the task of simulation validation to the real-world. However, the Bayesian inference is still challenging…
▽ More
A simulation is useful when the phenomenon of interest is either expensive to regenerate or irreproducible with the same context. Recently, Bayesian inference on the distribution of the simulation input parameter has been implemented sequentially to minimize the required simulation budget for the task of simulation validation to the real-world. However, the Bayesian inference is still challenging when the ground-truth posterior is multi-modal with a high-dimensional simulation output. This paper introduces a regularization technique, namely Neural Posterior Regularization (NPR), which enforces the model to explore the input parameter space effectively. Afterward, we provide the closed-form solution of the regularized optimization that enables analyzing the effect of the regularization. We empirically validate that NPR attains the statistically significant gain on benchmark performances for diverse simulation tasks.
△ Less
Submitted 3 November, 2022; v1 submitted 15 February, 2021;
originally announced February 2021.
-
Sequential Likelihood-Free Inference with Neural Proposal
Authors:
Dongjun Kim,
Kyungwoo Song,
YoonYeong Kim,
Yongjin Shin,
Wanmo Kang,
Il-Chul Moon,
Weonyoung Joo
Abstract:
Bayesian inference without the likelihood evaluation, or likelihood-free inference, has been a key research topic in simulation studies for gaining quantitatively validated simulation models on real-world datasets. As the likelihood evaluation is inaccessible, previous papers train the amortized neural network to estimate the ground-truth posterior for the simulation of interest. Training the netw…
▽ More
Bayesian inference without the likelihood evaluation, or likelihood-free inference, has been a key research topic in simulation studies for gaining quantitatively validated simulation models on real-world datasets. As the likelihood evaluation is inaccessible, previous papers train the amortized neural network to estimate the ground-truth posterior for the simulation of interest. Training the network and accumulating the dataset alternatively in a sequential manner could save the total simulation budget by orders of magnitude. In the data accumulation phase, the new simulation inputs are chosen within a portion of the total simulation budget to accumulate upon the collected dataset. This newly accumulated data degenerates because the set of simulation inputs is hardly mixed, and this degenerated data collection process ruins the posterior inference. This paper introduces a new sampling approach, called Neural Proposal (NP), of the simulation input that resolves the biased data collection as it guarantees the i.i.d. sampling. The experiments show the improved performance of our sampler, especially for the simulations with multi-modal posteriors.
△ Less
Submitted 4 November, 2022; v1 submitted 15 October, 2020;
originally announced October 2020.
-
Density Propagation with Characteristics-based Deep Learning
Authors:
Tenavi Nakamura-Zimmerer,
Daniele Venturi,
Qi Gong,
Wei Kang
Abstract:
Uncertainty propagation in nonlinear dynamic systems remains an outstanding problem in scientific computing and control. Numerous approaches have been developed, but are limited in their capability to tackle problems with more than a few uncertain variables or require large amounts of simulation data. In this paper, we propose a data-driven method for approximating joint probability density functi…
▽ More
Uncertainty propagation in nonlinear dynamic systems remains an outstanding problem in scientific computing and control. Numerous approaches have been developed, but are limited in their capability to tackle problems with more than a few uncertain variables or require large amounts of simulation data. In this paper, we propose a data-driven method for approximating joint probability density functions (PDFs) of nonlinear dynamic systems with initial condition and parameter uncertainty. Our approach leverages on the power of deep learning to deal with high-dimensional inputs, but we overcome the need for huge quantities of training data by encoding PDF evolution equations directly into the optimization problem. We demonstrate the potential of the proposed method by applying it to evaluate the robustness of a feedback controller for a six-dimensional rigid body with parameter uncertainty.
△ Less
Submitted 21 November, 2019;
originally announced November 2019.
-
Mixout: Effective Regularization to Finetune Large-scale Pretrained Language Models
Authors:
Cheolhyoung Lee,
Kyunghyun Cho,
Wanmo Kang
Abstract:
In natural language processing, it has been observed recently that generalization could be greatly improved by finetuning a large-scale language model pretrained on a large unlabeled corpus. Despite its recent success and wide adoption, finetuning a large pretrained language model on a downstream task is prone to degenerate performance when there are only a small number of training instances avail…
▽ More
In natural language processing, it has been observed recently that generalization could be greatly improved by finetuning a large-scale language model pretrained on a large unlabeled corpus. Despite its recent success and wide adoption, finetuning a large pretrained language model on a downstream task is prone to degenerate performance when there are only a small number of training instances available. In this paper, we introduce a new regularization technique, to which we refer as "mixout", motivated by dropout. Mixout stochastically mixes the parameters of two models. We show that our mixout technique regularizes learning to minimize the deviation from one of the two models and that the strength of regularization adapts along the optimization trajectory. We empirically evaluate the proposed mixout and its variants on finetuning a pretrained language model on downstream tasks. More specifically, we demonstrate that the stability of finetuning and the average accuracy greatly increase when we use the proposed approach to regularize finetuning of BERT on downstream tasks in GLUE.
△ Less
Submitted 22 January, 2020; v1 submitted 25 September, 2019;
originally announced September 2019.
-
QTRAN: Learning to Factorize with Transformation for Cooperative Multi-Agent Reinforcement Learning
Authors:
Kyunghwan Son,
Daewoo Kim,
Wan Ju Kang,
David Earl Hostallero,
Yung Yi
Abstract:
We explore value-based solutions for multi-agent reinforcement learning (MARL) tasks in the centralized training with decentralized execution (CTDE) regime popularized recently. However, VDN and QMIX are representative examples that use the idea of factorization of the joint action-value function into individual ones for decentralized execution. VDN and QMIX address only a fraction of factorizable…
▽ More
We explore value-based solutions for multi-agent reinforcement learning (MARL) tasks in the centralized training with decentralized execution (CTDE) regime popularized recently. However, VDN and QMIX are representative examples that use the idea of factorization of the joint action-value function into individual ones for decentralized execution. VDN and QMIX address only a fraction of factorizable MARL tasks due to their structural constraint in factorization such as additivity and monotonicity. In this paper, we propose a new factorization method for MARL, QTRAN, which is free from such structural constraints and takes on a new approach to transforming the original joint action-value function into an easily factorizable one, with the same optimal actions. QTRAN guarantees more general factorization than VDN or QMIX, thus covering a much wider class of MARL tasks than does previous methods. Our experiments for the tasks of multi-domain Gaussian-squeeze and modified predator-prey demonstrate QTRAN's superior performance with especially larger margins in games whose payoffs penalize non-cooperative behavior more aggressively.
△ Less
Submitted 14 May, 2019;
originally announced May 2019.
-
Directional Analysis of Stochastic Gradient Descent via von Mises-Fisher Distributions in Deep learning
Authors:
Cheolhyoung Lee,
Kyunghyun Cho,
Wanmo Kang
Abstract:
Although stochastic gradient descent (SGD) is a driving force behind the recent success of deep learning, our understanding of its dynamics in a high-dimensional parameter space is limited. In recent years, some researchers have used the stochasticity of minibatch gradients, or the signal-to-noise ratio, to better characterize the learning dynamics of SGD. Inspired from these work, we here analyze…
▽ More
Although stochastic gradient descent (SGD) is a driving force behind the recent success of deep learning, our understanding of its dynamics in a high-dimensional parameter space is limited. In recent years, some researchers have used the stochasticity of minibatch gradients, or the signal-to-noise ratio, to better characterize the learning dynamics of SGD. Inspired from these work, we here analyze SGD from a geometrical perspective by inspecting the stochasticity of the norms and directions of minibatch gradients. We propose a model of the directional concentration for minibatch gradients through von Mises-Fisher (VMF) distribution, and show that the directional uniformity of minibatch gradients increases over the course of SGD. We empirically verify our result using deep convolutional networks and observe a higher correlation between the gradient stochasticity and the proposed directional uniformity than that against the gradient norm stochasticity, suggesting that the directional statistics of minibatch gradients is a major factor behind SGD.
△ Less
Submitted 28 November, 2018; v1 submitted 29 September, 2018;
originally announced October 2018.