Showing 1–33 of 33 results for author: Kim, K I

  1. arXiv:2507.06125  [pdf, ps, other]

    cs.LG cs.AI

    Subspace-based Approximate Hessian Method for Zeroth-Order Optimization

    Authors: Dongyoon Kim, Sungjae Lee, Wonjin Lee, Kwang In Kim

    Abstract: Zeroth-order optimization addresses problems where gradient information is inaccessible or impractical to compute. While most existing methods rely on first-order approximations, incorporating second-order (curvature) information can, in principle, significantly accelerate convergence. However, the high cost of function evaluations required to estimate Hessian matrices often limits practical appli…

    Submitted 8 July, 2025; originally announced July 2025.

    Comments: 20 pages, 8 figures

  2. arXiv:2505.17475  [pdf, ps, other]

    cs.CV

    PoseBH: Prototypical Multi-Dataset Training Beyond Human Pose Estimation

    Authors: Uyoung Jeong, Jonathan Freer, Seungryul Baek, Hyung Jin Chang, Kwang In Kim

    Abstract: We study multi-dataset training (MDT) for pose estimation, where skeletal heterogeneity presents a unique challenge that existing methods have yet to address. In traditional domains, e.g., regression and classification, MDT typically relies on dataset merging or multi-head supervision. However, the diversity of skeleton types and limited cross-dataset supervision complicate integration in pose estim…

    Submitted 23 May, 2025; originally announced May 2025.

    Comments: accepted to CVPR 2025

  3. arXiv:2503.15035  [pdf, other]

    cs.AI cs.RO

    GraspCorrect: Robotic Grasp Correction via Vision-Language Model-Guided Feedback

    Authors: Sungjae Lee, Yeonjoo Hong, Kwang In Kim

    Abstract: Despite significant advancements in robotic manipulation, achieving consistent and stable grasping remains a fundamental challenge, often limiting the successful execution of complex tasks. Our analysis reveals that even state-of-the-art policy models frequently exhibit unstable grasping behaviors, leading to failure cases that create bottlenecks in real-world robotic applications. To address thes…

    Submitted 19 March, 2025; originally announced March 2025.

  4. arXiv:2412.07629  [pdf, other]

    cs.CL cs.AI

    Piece of Table: A Divide-and-Conquer Approach for Selecting Subtables in Table Question Answering

    Authors: Wonjin Lee, Kyumin Kim, Sungjae Lee, Jihun Lee, Kwang In Kim

    Abstract: Applying language models (LMs) to tables is challenging due to the inherent structural differences between two-dimensional tables and one-dimensional text for which the LMs were originally designed. Furthermore, when applying linearized tables to LMs, the maximum token lengths often imposed in self-attention calculations make it difficult to comprehensively understand the context spread across lar…

    Submitted 19 February, 2025; v1 submitted 10 December, 2024; originally announced December 2024.

  5. arXiv:2406.04772  [pdf, other]

    cs.LG cs.AI cs.CV

    REP: Resource-Efficient Prompting for Rehearsal-Free Continual Learning

    Authors: Sungho Jeon, Xinyue Ma, Kwang In Kim, Myeongjae Jeon

    Abstract: Recent rehearsal-free methods, guided by prompts, excel in vision-related continual learning (CL) with drifting data but lack resource efficiency, making real-world deployment challenging. In this paper, we introduce Resource-Efficient Prompting (REP), which improves the computational and memory efficiency of prompt-based rehearsal-free methods while minimizing accuracy trade-offs. Our approach em…

    Submitted 16 February, 2025; v1 submitted 7 June, 2024; originally announced June 2024.

  6. arXiv:2311.17094  [pdf, other]

    cs.LG cs.CV

    In Search of a Data Transformation That Accelerates Neural Field Training

    Authors: Junwon Seo, Sangyoon Lee, Kwang In Kim, Jaeho Lee

    Abstract: Neural field is an emerging paradigm in data representation that trains a neural network to approximate the given signal. A key obstacle that prevents its widespread adoption is the encoding speed: generating neural fields requires an overfitting of a neural network, which can take a significant number of SGD steps to reach the desired fidelity level. In this paper, we delve into the impacts of dat…

    Submitted 26 March, 2024; v1 submitted 28 November, 2023; originally announced November 2023.

    Comments: CVPR 2024

  7. arXiv:2309.14072  [pdf, other]

    cs.CV

    BoIR: Box-Supervised Instance Representation for Multi-Person Pose Estimation

    Authors: Uyoung Jeong, Seungryul Baek, Hyung Jin Chang, Kwang In Kim

    Abstract: Single-stage multi-person human pose estimation (MPPE) methods have shown great performance improvements, but existing methods fail to disentangle features by individual instances under crowded scenes. In this paper, we propose a bounding box-level instance representation learning called BoIR, which simultaneously solves instance detection, instance disentanglement, and instance-keypoint associati…

    Submitted 2 November, 2023; v1 submitted 25 September, 2023; originally announced September 2023.

    Comments: Accepted to BMVC 2023, 19 pages including the appendix, 6 figures, 7 tables

    Journal ref: BMVC. 34 (2023) 763-764

  8. arXiv:2301.02761  [pdf, other]

    cs.LG cs.CV

    Active Learning Guided by Efficient Surrogate Learners

    Authors: Yunpyo An, Suyeong Park, Kwang In Kim

    Abstract: Re-training a deep learning model each time a single data point receives a new label is impractical due to the inherent complexity of the training process. Consequently, existing active learning (AL) algorithms tend to adopt a batch-based approach where, during each AL iteration, a set of data points is collectively chosen for annotation. However, this strategy frequently leads to redundant sampli…

    Submitted 17 December, 2023; v1 submitted 6 January, 2023; originally announced January 2023.

  9. arXiv:2208.00874  [pdf, other]

    cs.CV

    S$^2$Contact: Graph-based Network for 3D Hand-Object Contact Estimation with Semi-Supervised Learning

    Authors: Tze Ho Elden Tse, Zhongqun Zhang, Kwang In Kim, Ales Leonardis, Feng Zheng, Hyung Jin Chang

    Abstract: Despite the recent efforts in accurate 3D annotations in hand and object datasets, there still exist gaps in 3D hand and object reconstructions. Existing works leverage contact maps to refine inaccurate hand-object pose estimations and generate grasps given object models. However, they require explicit 3D supervision which is seldom available and therefore, are limited to constrained settings, e.g…

    Submitted 3 August, 2023; v1 submitted 1 August, 2022; originally announced August 2022.

    Comments: Accepted to ECCV 2022

  10. arXiv:2204.13062  [pdf, other]

    cs.CV

    Collaborative Learning for Hand and Object Reconstruction with Attention-guided Graph Convolution

    Authors: Tze Ho Elden Tse, Kwang In Kim, Ales Leonardis, Hyung Jin Chang

    Abstract: Estimating the pose and shape of hands and objects under interaction finds numerous applications including augmented and virtual reality. Existing approaches for hand and object reconstruction require explicitly defined physical constraints and known objects, which limits its application domains. Our algorithm is agnostic to object models, and it learns the physical rules governing hand-object int…

    Submitted 27 April, 2022; originally announced April 2022.

    Comments: Accepted to CVPR 2022

  11. arXiv:2111.02865  [pdf, other]

    cs.LG cs.CV

    Testing using Privileged Information by Adapting Features with Statistical Dependence

    Authors: Kwang In Kim, James Tompkin

    Abstract: Given an imperfect predictor, we exploit additional features at test time to improve the predictions made, without retraining and without knowledge of the prediction function. This scenario arises if training labels or data are proprietary, restricted, or no longer available, or if training itself is prohibitively expensive. We assume that the additional features are useful if they exhibit strong…

    Submitted 4 November, 2021; originally announced November 2021.

    Comments: Published at ICCV 2021. Webpage: http://www.jamestompkin.com/tupi

  12. arXiv:2107.07330  [pdf, other]

    cs.CV

    DynaDog+T: A Parametric Animal Model for Synthetic Canine Image Generation

    Authors: Jake Deane, Sinead Kearney, Kwang In Kim, Darren Cosker

    Abstract: Synthetic data is becoming increasingly common for training computer vision models for a variety of tasks. Notably, such data has been applied in tasks related to humans such as 3D pose estimation where data is either difficult to create or obtain in realistic settings. Comparatively, there has been less work into synthetic animal data and its uses for training models. Consequently, we introduce…

    Submitted 20 July, 2021; v1 submitted 15 July, 2021; originally announced July 2021.

    Comments: CV4Animals Workshop in CVPR 2021. Update to correct minor spelling and grammar mistakes in supplementary material

  13. arXiv:2106.13215  [pdf]

    cs.CV

    GaussiGAN: Controllable Image Synthesis with 3D Gaussians from Unposed Silhouettes

    Authors: Youssef A. Mejjati, Isa Milefchik, Aaron Gokaslan, Oliver Wang, Kwang In Kim, James Tompkin

    Abstract: We present an algorithm that learns a coarse 3D representation of objects from unposed multi-view 2D mask supervision, then uses it to generate detailed mask and image texture. In contrast to existing voxel-based methods for unposed object reconstruction, our approach learns to represent the generated shape and pose with a set of self-supervised canonical 3D anisotropic Gaussians via a perspective…

    Submitted 24 June, 2021; originally announced June 2021.

  14. arXiv:2008.05413  [pdf, other]

    cs.CV

    Look here! A parametric learning based approach to redirect visual attention

    Authors: Youssef Alami Mejjati, Celso F. Gomez, Kwang In Kim, Eli Shechtman, Zoya Bylinskii

    Abstract: Across photography, marketing, and website design, being able to direct the viewer's attention is a powerful tool. Motivated by professional workflows, we introduce an automatic method to make an image region more attention-capturing via subtle image edits that maintain realism and fidelity to the original. From an input image and a user-provided mask, our GazeShiftNet model predicts a distinct se…

    Submitted 12 August, 2020; originally announced August 2020.

    Comments: To appear in ECCV 2020

  15. arXiv:2007.08012  [pdf, other]

    cs.LG cs.CV

    Combining Task Predictors via Enhancing Joint Predictability

    Authors: Kwang In Kim, Christian Richardt, Hyung Jin Chang

    Abstract: Predictor combination aims to improve a (target) predictor of a learning task based on the (reference) predictors of potentially relevant tasks, without having access to the internals of individual predictors. We present a new predictor combination algorithm that improves the target by i) measuring the relevance of references based on their capabilities in predicting the target, and ii) strengthen…

    Submitted 15 July, 2020; originally announced July 2020.

  16. arXiv:2004.07788  [pdf, other]

    cs.CV

    RGBD-Dog: Predicting Canine Pose from RGBD Sensors

    Authors: Sinead Kearney, Wenbin Li, Martin Parsons, Kwang In Kim, Darren Cosker

    Abstract: The automatic extraction of animal 3D pose from images without markers is of interest in a range of scientific fields. Most work to date predicts animal pose from RGB images, based on 2D labelling of joint positions. However, due to the difficult nature of obtaining training data, no ground truth dataset of 3D animal motion is available to quantitatively evaluate these approaches. In additio…

    Submitted 16 April, 2020; originally announced April 2020.

    Comments: 18 pages, 16 figures, to be published in CVPR 2020

  17. arXiv:2002.04709  [pdf, other]

    cs.LG stat.ML

    Task-Aware Variational Adversarial Active Learning

    Authors: Kwanyoung Kim, Dongwon Park, Kwang In Kim, Se Young Chun

    Abstract: Often, labeling a large amount of data is challenging due to high labeling cost, limiting the application domain of deep learning techniques. Active learning (AL) tackles this by querying the most informative samples to be annotated from the unlabeled pool. Two promising directions for AL that have been recently explored are the task-agnostic approach to select data points that are far from the current labe…

    Submitted 8 December, 2020; v1 submitted 11 February, 2020; originally announced February 2020.

    Comments: 14 pages, 13 figures, 1 table

  18. arXiv:2001.02595  [pdf, other]

    cs.CV eess.IV

    Generating Object Stamps

    Authors: Youssef Alami Mejjati, Zejiang Shen, Michael Snower, Aaron Gokaslan, Oliver Wang, James Tompkin, Kwang In Kim

    Abstract: We present an algorithm to generate diverse foreground objects and composite them into background images using a GAN architecture. Given an object class, a user-provided bounding box, and a background image, we first use a mask generator to create an object shape, and then use a texture generator to fill the mask such that the texture integrates with the background. By separating the problem of ob…

    Submitted 10 January, 2020; v1 submitted 1 January, 2020; originally announced January 2020.

    Comments: 27 pages, 25 figures, 11 tables. Paper under review

  19. arXiv:1905.04967  [pdf, other]

    cs.LG cs.CV stat.ML

    Implicit Filter Sparsification In Convolutional Neural Networks

    Authors: Dushyant Mehta, Kwang In Kim, Christian Theobalt

    Abstract: We show implicit filter level sparsity manifests in convolutional neural networks (CNNs) which employ Batch Normalization and ReLU activation, and are trained with adaptive gradient descent techniques and L2 regularization or weight decay. Through an extensive empirical study (Mehta et al., 2019) we hypothesize the mechanism behind the sparsification process, and find surprising links to certain f…

    Submitted 13 May, 2019; originally announced May 2019.

    Comments: ODML-CDNNR 2019 (ICML'19 workshop) extended abstract of the CVPR 2019 paper "On Implicit Filter Level Sparsity in Convolutional Neural Networks, Mehta et al." (arXiv:1811.12495)

  20. arXiv:1904.05159  [pdf, other]

    cs.CV

    Joint Manifold Diffusion for Combining Predictions on Decoupled Observations

    Authors: Kwang In Kim, Hyung Jin Chang

    Abstract: We present a new predictor combination algorithm that improves a given task predictor based on potentially relevant reference predictors. Existing approaches are limited in that, to discover the underlying task dependence, they either require known parametric forms of all predictors or access to a single fixed dataset on which all predictors are jointly evaluated. To overcome these limitations, we…

    Submitted 10 April, 2019; originally announced April 2019.

    Comments: Published at CVPR 2019

  21. arXiv:1904.04196  [pdf, other]

    cs.CV

    Pushing the Envelope for RGB-based Dense 3D Hand Pose Estimation via Neural Rendering

    Authors: Seungryul Baek, Kwang In Kim, Tae-Kyun Kim

    Abstract: Estimating 3D hand meshes from single RGB images is challenging, due to intrinsic 2D-3D mapping ambiguities and limited training data. We adopt a compact parametric 3D hand model that represents deformable and articulated hand meshes. To achieve the model fitting to RGB images, we investigate and contribute in three ways: 1) Neural rendering: inspired by recent work on human body, our hand mesh es…

    Submitted 9 April, 2019; v1 submitted 8 April, 2019; originally announced April 2019.

    Comments: Accepted to CVPR 2019

  22. arXiv:1811.12495  [pdf, other]

    cs.LG cs.CV eess.SP stat.ML

    On Implicit Filter Level Sparsity in Convolutional Neural Networks

    Authors: Dushyant Mehta, Kwang In Kim, Christian Theobalt

    Abstract: We investigate filter level sparsity that emerges in convolutional neural networks (CNNs) which employ Batch Normalization and ReLU activation, and are trained with adaptive gradient descent techniques and L2 regularization or weight decay. We conduct an extensive experimental study casting our initial findings into hypotheses and conclusions about the mechanisms underlying the emergent filter lev…

    Submitted 5 April, 2019; v1 submitted 29 November, 2018; originally announced November 2018.

    Comments: Accepted at CVPR 2019

  23. arXiv:1808.04325  [pdf, other]

    cs.CV

    Improving Shape Deformation in Unsupervised Image-to-Image Translation

    Authors: Aaron Gokaslan, Vivek Ramanujan, Daniel Ritchie, Kwang In Kim, James Tompkin

    Abstract: Unsupervised image-to-image translation techniques are able to map local texture between two domains, but they are typically unsuccessful when the domains require larger shape change. Inspired by semantic segmentation, we introduce a discriminator with dilated convolutions that is able to use information from across the entire image to train a more context-aware generator. This is coupled with a m…

    Submitted 17 January, 2019; v1 submitted 13 August, 2018; originally announced August 2018.

  24. arXiv:1806.03891  [pdf, other]

    cs.CV

    Multi-Task Deep Networks for Depth-Based 6D Object Pose and Joint Registration in Crowd Scenarios

    Authors: Juil Sock, Kwang In Kim, Caner Sahin, Tae-Kyun Kim

    Abstract: In bin-picking scenarios, multiple instances of an object of interest are stacked in a pile randomly, and hence, the instances are inherently subjected to the challenges: severe occlusion, clutter, and similar-looking distractors. Most existing methods are, however, for single isolated object instances, while some recent methods tackle crowd scenarios as post-refinement which accounts multiple obj…

    Submitted 11 June, 2018; originally announced June 2018.

  25. arXiv:1806.02311  [pdf, other]

    cs.CV cs.AI

    Unsupervised Attention-guided Image to Image Translation

    Authors: Youssef A. Mejjati, Christian Richardt, James Tompkin, Darren Cosker, Kwang In Kim

    Abstract: Current unsupervised image-to-image translation techniques struggle to focus their attention on individual objects without altering the background or the way multiple objects interact within a scene. Motivated by the important role of attention in human perception, we tackle this limitation by introducing unsupervised attention mechanisms that are jointly adversarially trained with the generators a…

    Submitted 8 November, 2018; v1 submitted 6 June, 2018; originally announced June 2018.

    Journal ref: NIPS 2018

  26. arXiv:1805.04497  [pdf, other]

    cs.CV

    Augmented Skeleton Space Transfer for Depth-based Hand Pose Estimation

    Authors: Seungryul Baek, Kwang In Kim, Tae-Kyun Kim

    Abstract: Crucial to the success of training a depth-based 3D hand pose estimator (HPE) is the availability of comprehensive datasets covering diverse camera perspectives, shapes, and pose variations. However, collecting such annotated datasets is challenging. We propose to complete existing databases by generating new database entries. The key idea is to synthesize data in the skeleton space (instead of do…

    Submitted 11 May, 2018; originally announced May 2018.

    Comments: Accepted to CVPR 2018

  27. arXiv:1804.04082  [pdf, other]

    cs.CV

    Ranking CGANs: Subjective Control over Semantic Image Attributes

    Authors: Yassir Saquil, Kwang In Kim, Peter Hall

    Abstract: In this paper, we investigate the use of generative adversarial networks in the task of image generation according to subjective measures of semantic attributes. Unlike the standard (CGAN) that generates images from discrete categorical labels, our architecture handles both continuous and discrete scales. Given pairwise comparisons of images, our model, called RankCGAN, performs two tasks: it lear…

    Submitted 24 July, 2018; v1 submitted 11 April, 2018; originally announced April 2018.

  28. arXiv:1706.03863  [pdf, other]

    cs.CV

    Criteria Sliders: Learning Continuous Database Criteria via Interactive Ranking

    Authors: James Tompkin, Kwang In Kim, Hanspeter Pfister, Christian Theobalt

    Abstract: Large databases are often organized by hand-labeled metadata, or criteria, which are expensive to collect. We can use unsupervised learning to model database variation, but these models are often high dimensional, complex to parameterize, or require expert knowledge. We learn low-dimensional continuous criteria via interactive ranking, so that the novice user need only describe the relative orderi…

    Submitted 12 June, 2017; originally announced June 2017.

  29. arXiv:1706.02003  [pdf, other]

    cs.CV

    Deep Convolutional Decision Jungle for Image Classification

    Authors: Seungryul Baek, Kwang In Kim, Tae-Kyun Kim

    Abstract: We propose a novel method called deep convolutional decision jungle (CDJ) and its learning algorithm for image classification. The CDJ maintains the structure of standard convolutional neural networks (CNNs), i.e. multiple layers of multiple response maps fully connected. Each response map, or node, in both the convolutional and fully-connected layers selectively responds to class labels s.t. each da…

    Submitted 18 May, 2018; v1 submitted 6 June, 2017; originally announced June 2017.

  30. arXiv:1610.09334  [pdf, ps, other]

    cs.CV

    Real-time Online Action Detection Forests using Spatio-temporal Contexts

    Authors: Seungryul Baek, Kwang In Kim, Tae-Kyun Kim

    Abstract: Online action detection (OAD) is challenging since 1) robust yet computationally expensive features cannot be straightforwardly used due to the real-time processing requirements and 2) the localization and classification of actions have to be performed even before they are fully observed. We propose a new random forest (RF)-based online action detection framework that addresses these challenges. O…

    Submitted 28 October, 2016; originally announced October 2016.

  31. Context-guided diffusion for label propagation on graphs

    Authors: Kwang In Kim, James Tompkin, Hanspeter Pfister, Christian Theobalt

    Abstract: Existing approaches for diffusion on graphs, e.g., for label propagation, are mainly focused on isotropic diffusion, which is induced by the commonly-used graph Laplacian regularizer. Inspired by the success of diffusivity tensors for anisotropic diffusion in image processing, we present anisotropic diffusion on graphs and the corresponding label propagation algorithm. We develop positive definit…

    Submitted 20 February, 2016; originally announced February 2016.

  32. Semi-supervised Learning with Explicit Relationship Regularization

    Authors: Kwang In Kim, James Tompkin, Hanspeter Pfister, Christian Theobalt

    Abstract: In many learning tasks, the structure of the target space of a function holds rich information about the relationships between evaluations of functions on different data points. Existing approaches attempt to exploit this relationship information implicitly by enforcing smoothness on function evaluations only. However, what happens if we explicitly regularize the relationships between function eva…

    Submitted 11 February, 2016; originally announced February 2016.

    Comments: Accepted version of paper published at CVPR 2015, http://dx.doi.org/10.1109/CVPR.2015.7298831

  33. Local High-order Regularization on Data Manifolds

    Authors: Kwang In Kim, James Tompkin, Hanspeter Pfister, Christian Theobalt

    Abstract: The common graph Laplacian regularizer is well-established in semi-supervised learning and spectral dimensionality reduction. However, as a first-order regularizer, it can lead to degenerate functions in high-dimensional manifolds. The iterated graph Laplacian enables high-order regularization, but it has a high computational complexity and so cannot be applied to large problems. We introduce a ne…

    Submitted 11 February, 2016; originally announced February 2016.

    Comments: Accepted version of paper published at CVPR 2015, http://dx.doi.org/10.1109/CVPR.2015.7299186