-
Accelerating Battery Material Optimization through iterative Machine Learning
Authors:
Seon-Hwa Lee,
Insoo Ye,
Changhwan Lee,
Jieun Kim,
Geunho Choi,
Sang-Cheol Nam,
Inchul Park
Abstract:
The performance of battery materials is determined by their composition and the processing conditions employed during commercial-scale fabrication, where raw materials undergo complex processing steps with various additives to yield final products. As the complexity of these parameters expands with the development of industry, conventional one-factor-at-a-time (OFAT) experiment becomes old fashion…
▽ More
The performance of battery materials is determined by their composition and the processing conditions employed during commercial-scale fabrication, where raw materials undergo complex processing steps with various additives to yield final products. As the complexity of these parameters expands with the development of industry, conventional one-factor-at-a-time (OFAT) experiment becomes old fashioned. While domain expertise aids in parameter optimization, this traditional approach becomes increasingly vulnerable to cognitive limitations and anthropogenic biases as the complexity of factors grows. Herein, we introduce an iterative machine learning (ML) framework that integrates active learning to guide targeted experimentation and facilitate incremental model refinement. This method systematically leverages comprehensive experimental observations, including both successful and unsuccessful results, effectively mitigating human-induced biases and alleviating data scarcity. Consequently, it significantly accelerates exploration within the high-dimensional design space. Our results demonstrate that active-learning-driven experimentation markedly reduces the total number of experimental cycles necessary, underscoring the transformative potential of ML-based strategies in expediting battery material optimization.
△ Less
Submitted 12 May, 2025;
originally announced May 2025.
-
Performance of a Threshold-based WDM and ACM for FSO Communication between Mobile Platforms in Maritime Environments
Authors:
Jae-Eun Han,
Sung Sik Nam,
Duck Dong Hwang,
Mohamed-Slim Alouini
Abstract:
In this study, we statistically analyze the performance of a threshold-based multiple optical signal selection scheme (TMOS) for wavelength division multiplexing (WDM) and adaptive coded modulation (ACM) using free space optical (FSO) communication between mobile platforms in maritime environments with fog and 3D pointing errors. Specifically, we derive a new closed-form expression for a composite…
▽ More
In this study, we statistically analyze the performance of a threshold-based multiple optical signal selection scheme (TMOS) for wavelength division multiplexing (WDM) and adaptive coded modulation (ACM) using free space optical (FSO) communication between mobile platforms in maritime environments with fog and 3D pointing errors. Specifically, we derive a new closed-form expression for a composite probability density function (PDF) that is more appropriate for applying various algorithms to FSO systems under the combined effects of fog and pointing errors. We then analyze the outage probability, average spectral efficiency (ASE), and bit error rate (BER) performance of the conventional detection techniques (i.e., heterodyne and intensity modulation/direct detection). The derived analytical results were cross-verified using Monte Carlo simulations. The results show that we can obtain a higher ASE performance by applying TMOS-based WDM and ACM and that the probability of the beam being detected in the photodetector increased at a low signal-to-noise ratio, contrary to conventional performance. Furthermore, it has been confirmed that applying WDM and ACM is suitable, particularly in maritime environments where channel conditions frequently change.
△ Less
Submitted 14 October, 2024;
originally announced October 2024.
-
FH-DRL: Exponential-Hyperbolic Frontier Heuristics with DRL for accelerated Exploration in Unknown Environments
Authors:
Seunghyeop Nam,
Tuan Anh Nguyen,
Eunmi Choi,
Dugki Min
Abstract:
Autonomous robot exploration in large-scale or cluttered environments remains a central challenge in intelligent vehicle applications, where partial or absent prior maps constrain reliable navigation. This paper introduces FH-DRL, a novel framework that integrates a customizable heuristic function for frontier detection with a Twin Delayed DDPG (TD3) agent for continuous, high-speed local navigati…
▽ More
Autonomous robot exploration in large-scale or cluttered environments remains a central challenge in intelligent vehicle applications, where partial or absent prior maps constrain reliable navigation. This paper introduces FH-DRL, a novel framework that integrates a customizable heuristic function for frontier detection with a Twin Delayed DDPG (TD3) agent for continuous, high-speed local navigation. The proposed heuristic relies on an exponential-hyperbolic distance score, which balances immediate proximity against long-range exploration gains, and an occupancy-based stochastic measure, accounting for environmental openness and obstacle densities in real time. By ranking frontiers using these adaptive metrics, FH-DRL targets highly informative yet tractable waypoints, thereby minimizing redundant paths and total exploration time. We thoroughly evaluate FH-DRL across multiple simulated and real-world scenarios, demonstrating clear improvements in travel distance and completion time over frontier-only or purely DRL-based exploration. In structured corridor layouts and maze-like topologies, our architecture consistently outperforms standard methods such as Nearest Frontier, Cognet Frontier Exploration, and Goal Driven Autonomous Exploration. Real-world tests with a Turtlebot3 platform further confirm robust adaptation to previously unseen or cluttered indoor spaces. The results highlight FH-DRL as an efficient and generalizable approach for frontier-based exploration in large or partially known environments, offering a promising direction for various autonomous driving, industrial, and service robotics tasks.
△ Less
Submitted 12 February, 2025; v1 submitted 26 July, 2024;
originally announced July 2024.
-
Freq-Mip-AA : Frequency Mip Representation for Anti-Aliasing Neural Radiance Fields
Authors:
Youngin Park,
Seungtae Nam,
Cheul-hee Hahm,
Eunbyung Park
Abstract:
Neural Radiance Fields (NeRF) have shown remarkable success in representing 3D scenes and generating novel views. However, they often struggle with aliasing artifacts, especially when rendering images from different camera distances from the training views. To address the issue, Mip-NeRF proposed using volumetric frustums to render a pixel and suggested integrated positional encoding (IPE). While…
▽ More
Neural Radiance Fields (NeRF) have shown remarkable success in representing 3D scenes and generating novel views. However, they often struggle with aliasing artifacts, especially when rendering images from different camera distances from the training views. To address the issue, Mip-NeRF proposed using volumetric frustums to render a pixel and suggested integrated positional encoding (IPE). While effective, this approach requires long training times due to its reliance on MLP architecture. In this work, we propose a novel anti-aliasing technique that utilizes grid-based representations, usually showing significantly faster training time. In addition, we exploit frequency-domain representation to handle the aliasing problem inspired by the sampling theorem. The proposed method, FreqMipAA, utilizes scale-specific low-pass filtering (LPF) and learnable frequency masks. Scale-specific low-pass filters (LPF) prevent aliasing and prioritize important image details, and learnable masks effectively remove problematic high-frequency elements while retaining essential information. By employing a scale-specific LPF and trainable masks, FreqMipAA can effectively eliminate the aliasing factor while retaining important details. We validated the proposed technique by incorporating it into a widely used grid-based method. The experimental results have shown that the FreqMipAA effectively resolved the aliasing issues and achieved state-of-the-art results in the multi-scale Blender dataset. Our code is available at https://github.com/yi0109/FreqMipAA .
△ Less
Submitted 19 June, 2024;
originally announced June 2024.
-
Hierarchical Climate Control Strategy for Electric Vehicles with Door-Opening Consideration
Authors:
Sanghyeon Nam,
Hyejin Lee,
Youngki Kim,
Kyoung hyun Kwak,
Kyoungseok Han
Abstract:
This study proposes a novel climate control strategy for electric vehicles (EVs) by addressing door-opening interruptions, an overlooked aspect in EV thermal management. We create and validate an EV simulation model that incorporates door-opening scenarios. Three controllers are compared using the simulation model: (i) a hierarchical non-linear model predictive control (NMPC) with a unique coolant…
▽ More
This study proposes a novel climate control strategy for electric vehicles (EVs) by addressing door-opening interruptions, an overlooked aspect in EV thermal management. We create and validate an EV simulation model that incorporates door-opening scenarios. Three controllers are compared using the simulation model: (i) a hierarchical non-linear model predictive control (NMPC) with a unique coolant dividing layer and a component for cabin air inflow regulation based on door-opening signals; (ii) a single MPC controller; and (iii) a rule-based controller. The hierarchical controller outperforms, reducing door-opening temperature drops by 46.96% and 51.33% compared to single layer MPC and rule-based methods in the relevant section. Additionally, our strategy minimizes the maximum temperature gaps between the sections during recovery by 86.4% and 78.7%, surpassing single layer MPC and rule-based approaches, respectively. We believe that this result opens up future possibilities for incorporating the thermal comfort of passengers across all sections within the vehicle.
△ Less
Submitted 31 March, 2024;
originally announced April 2024.
-
Multi-Robot Relative Pose Estimation in SE(2) with Observability Analysis: A Comparison of Extended Kalman Filtering and Robust Pose Graph Optimization
Authors:
Kihoon Shin,
Hyunjae Sim,
Seungwon Nam,
Yonghee Kim,
Jae Hu,
Kwang-Ki K. Kim
Abstract:
In this study, we address multi-robot localization issues, with a specific focus on cooperative localization and observability analysis of relative pose estimation. Cooperative localization involves enhancing each robot's information through a communication network and message passing. If odometry data from a target robot can be transmitted to the ego robot, observability of their relative pose es…
▽ More
In this study, we address multi-robot localization issues, with a specific focus on cooperative localization and observability analysis of relative pose estimation. Cooperative localization involves enhancing each robot's information through a communication network and message passing. If odometry data from a target robot can be transmitted to the ego robot, observability of their relative pose estimation can be achieved through range-only or bearing-only measurements, provided both robots have non-zero linear velocities. In cases where odometry data from a target robot are not directly transmitted but estimated by the ego robot, both range and bearing measurements are necessary to ensure observability of relative pose estimation. For ROS/Gazebo simulations, we explore four sensing and communication structures. We compare extended Kalman filtering (EKF) and pose graph optimization (PGO) estimation using different robust loss functions (filtering and smoothing with varying batch sizes of sliding windows) in terms of estimation accuracy. In hardware experiments, two Turtlebot3 equipped with UWB modules are used for real-world inter-robot relative pose estimation, applying both EKF and PGO and comparing their performance.
△ Less
Submitted 4 February, 2024; v1 submitted 27 January, 2024;
originally announced January 2024.
-
Depolarized Holography with Polarization-multiplexing Metasurface
Authors:
Seung-Woo Nam,
Youngjin Kim,
Dongyeon Kim,
Yoonchan Jeong
Abstract:
The evolution of computer-generated holography (CGH) algorithms has prompted significant improvements in the performances of holographic displays. Nonetheless, they start to encounter a limited degree of freedom in CGH optimization and physical constraints stemming from the coherent nature of holograms. To surpass the physical limitations, we consider polarization as a new degree of freedom by uti…
▽ More
The evolution of computer-generated holography (CGH) algorithms has prompted significant improvements in the performances of holographic displays. Nonetheless, they start to encounter a limited degree of freedom in CGH optimization and physical constraints stemming from the coherent nature of holograms. To surpass the physical limitations, we consider polarization as a new degree of freedom by utilizing a novel optical platform called metasurface. Polarization-multiplexing metasurfaces enable incoherent-like behavior in holographic displays due to the mutual incoherence of orthogonal polarization states. We leverage this unique characteristic of a metasurface by integrating it into a holographic display and exploiting polarization diversity to bring an additional degree of freedom for CGH algorithms. To minimize the speckle noise while maximizing the image quality, we devise a fully differentiable optimization pipeline by taking into account the metasurface proxy model, thereby jointly optimizing spatial light modulator phase patterns and geometric parameters of metasurface nanostructures. We evaluate the metasurface-enabled depolarized holography through simulations and experiments, demonstrating its ability to reduce speckle noise and enhance image quality.
△ Less
Submitted 26 September, 2023;
originally announced September 2023.
-
Technical Report: Development of an Ultrahigh Bandwidth Software-defined Radio Platform
Authors:
Sung Sik Nam,
Changseok Yoon,
Ki-Hong Park,
Mohamed-Slim Alouini
Abstract:
For the development of new digital signal processing systems and services, the rapid, easy, and convenient prototyping of ideas and the rapid time-to-market of products are becoming important with advances in technology. Conventionally, for the development stage, particularly when confirming the feasibility or performance of a new system or service, an idea is first confirmed through a computerbas…
▽ More
For the development of new digital signal processing systems and services, the rapid, easy, and convenient prototyping of ideas and the rapid time-to-market of products are becoming important with advances in technology. Conventionally, for the development stage, particularly when confirming the feasibility or performance of a new system or service, an idea is first confirmed through a computerbased software simulation after developing an accurate model of the operating environment. Next, this idea is validated and tested in the real operating environment. The new systems or services and their operating environments are becoming increasingly complicated. Hence, their development processes too are more complex cost- and time-intensive tasks that require engineers with skill and professional knowledge/experience. Furthermore, for ensuring fast time-to-market, all the development processes encompassing the (i) algorithm development, (ii) product prototyping, and (iii) final product development, must be closely linked such that they can be quickly completed. In this context, the aim of this paper is to propose an ultrahigh bandwidth software-defined radio platform that can prototype a quasi-real-time operating system without a developer having sophisticated hardware/software expertise. This platform allows the realization of a software-implemented digital signal processing system in minimal time with minimal efforts and without the need of a host computer.
△ Less
Submitted 26 August, 2022; v1 submitted 25 August, 2022;
originally announced August 2022.
-
Learning sRGB-to-Raw-RGB De-rendering with Content-Aware Metadata
Authors:
Seonghyeon Nam,
Abhijith Punnappurath,
Marcus A. Brubaker,
Michael S. Brown
Abstract:
Most camera images are rendered and saved in the standard RGB (sRGB) format by the camera's hardware. Due to the in-camera photo-finishing routines, nonlinear sRGB images are undesirable for computer vision tasks that assume a direct relationship between pixel values and scene radiance. For such applications, linear raw-RGB sensor images are preferred. Saving images in their raw-RGB format is stil…
▽ More
Most camera images are rendered and saved in the standard RGB (sRGB) format by the camera's hardware. Due to the in-camera photo-finishing routines, nonlinear sRGB images are undesirable for computer vision tasks that assume a direct relationship between pixel values and scene radiance. For such applications, linear raw-RGB sensor images are preferred. Saving images in their raw-RGB format is still uncommon due to the large storage requirement and lack of support by many imaging applications. Several "raw reconstruction" methods have been proposed that utilize specialized metadata sampled from the raw-RGB image at capture time and embedded in the sRGB image. This metadata is used to parameterize a mapping function to de-render the sRGB image back to its original raw-RGB format when needed. Existing raw reconstruction methods rely on simple sampling strategies and global mapping to perform the de-rendering. This paper shows how to improve the de-rendering results by jointly learning sampling and reconstruction. Our experiments show that our learned sampling can adapt to the image content to produce better raw reconstructions than existing methods. We also describe an online fine-tuning strategy for the reconstruction network to improve results further.
△ Less
Submitted 3 June, 2022;
originally announced June 2022.
-
STRATA: Word Boundaries & Phoneme Recognition From Continuous Urdu Speech using Transfer Learning, Attention, & Data Augmentation
Authors:
Saad Naeem,
Omer Beg
Abstract:
Phoneme recognition is a largely unsolved problem in NLP, especially for low-resource languages like Urdu. The systems that try to extract the phonemes from audio speech require hand-labeled phonetic transcriptions. This requires expert linguists to annotate speech data with its relevant phonetic representation which is both an expensive and a tedious task. In this paper, we propose STRATA, a fram…
▽ More
Phoneme recognition is a largely unsolved problem in NLP, especially for low-resource languages like Urdu. The systems that try to extract the phonemes from audio speech require hand-labeled phonetic transcriptions. This requires expert linguists to annotate speech data with its relevant phonetic representation which is both an expensive and a tedious task. In this paper, we propose STRATA, a framework for supervised phoneme recognition that overcomes the data scarcity issue for low resource languages using a seq2seq neural architecture integrated with transfer learning, attention mechanism, and data augmentation. STRATA employs transfer learning to reduce the network loss in half. It uses attention mechanism for word boundaries and frame alignment detection which further reduces the network loss by 4% and is able to identify the word boundaries with 92.2% accuracy. STRATA uses various data augmentation techniques to further reduce the loss by 1.5% and is more robust towards new signals both in terms of generalization and accuracy. STRATA is able to achieve a Phoneme Error Rate of 16.5% and improves upon the state of the art by 1.1% for TIMIT dataset (English) and 11.5% for CSaLT dataset (Urdu).
△ Less
Submitted 16 April, 2022;
originally announced April 2022.
-
Learning JPEG Compression Artifacts for Image Manipulation Detection and Localization
Authors:
Myung-Joon Kwon,
Seung-Hun Nam,
In-Jae Yu,
Heung-Kyu Lee,
Changick Kim
Abstract:
Detecting and localizing image manipulation are necessary to counter malicious use of image editing techniques. Accordingly, it is essential to distinguish between authentic and tampered regions by analyzing intrinsic statistics in an image. We focus on JPEG compression artifacts left during image acquisition and editing. We propose a convolutional neural network (CNN) that uses discrete cosine tr…
▽ More
Detecting and localizing image manipulation are necessary to counter malicious use of image editing techniques. Accordingly, it is essential to distinguish between authentic and tampered regions by analyzing intrinsic statistics in an image. We focus on JPEG compression artifacts left during image acquisition and editing. We propose a convolutional neural network (CNN) that uses discrete cosine transform (DCT) coefficients, where compression artifacts remain, to localize image manipulation. Standard CNNs cannot learn the distribution of DCT coefficients because the convolution throws away the spatial coordinates, which are essential for DCT coefficients. We illustrate how to design and train a neural network that can learn the distribution of DCT coefficients. Furthermore, we introduce Compression Artifact Tracing Network (CAT-Net) that jointly uses image acquisition artifacts and compression artifacts. It significantly outperforms traditional and deep neural network-based methods in detecting and localizing tampered regions.
△ Less
Submitted 25 May, 2022; v1 submitted 29 August, 2021;
originally announced August 2021.
-
BitMix: Data Augmentation for Image Steganalysis
Authors:
In-Jae Yu,
Wonhyuk Ahn,
Seung-Hun Nam,
Heung-Kyu Lee
Abstract:
Convolutional neural networks (CNN) for image steganalysis demonstrate better performances with employing concepts from high-level vision tasks. The major employed concept is to use data augmentation to avoid overfitting due to limited data. To augment data without damaging the message embedding, only rotating multiples of 90 degrees or horizontally flipping are used in steganalysis, which generat…
▽ More
Convolutional neural networks (CNN) for image steganalysis demonstrate better performances with employing concepts from high-level vision tasks. The major employed concept is to use data augmentation to avoid overfitting due to limited data. To augment data without damaging the message embedding, only rotating multiples of 90 degrees or horizontally flipping are used in steganalysis, which generates eight fixed results from one sample. To overcome this limitation, we propose BitMix, a data augmentation method for spatial image steganalysis. BitMix mixes a cover and stego image pair by swapping the random patch and generates an embedding adaptive label with the ratio of the number of pixels modified in the swapped patch to those in the cover-stego pair. We explore optimal hyperparameters, the ratio of applying BitMix in the mini-batch, and the size of the bounding box for swapping patch. The results reveal that using BitMix improves the performance of spatial image steganalysis and better than other data augmentation methods.
△ Less
Submitted 30 June, 2020;
originally announced June 2020.
-
Learning the Loss Functions in a Discriminative Space for Video Restoration
Authors:
Younghyun Jo,
Jaeyeon Kang,
Seoung Wug Oh,
Seonghyeon Nam,
Peter Vajda,
Seon Joo Kim
Abstract:
With more advanced deep network architectures and learning schemes such as GANs, the performance of video restoration algorithms has greatly improved recently. Meanwhile, the loss functions for optimizing deep neural networks remain relatively unchanged. To this end, we propose a new framework for building effective loss functions by learning a discriminative space specific to a video restoration…
▽ More
With more advanced deep network architectures and learning schemes such as GANs, the performance of video restoration algorithms has greatly improved recently. Meanwhile, the loss functions for optimizing deep neural networks remain relatively unchanged. To this end, we propose a new framework for building effective loss functions by learning a discriminative space specific to a video restoration task. Our framework is similar to GANs in that we iteratively train two networks - a generator and a loss network. The generator learns to restore videos in a supervised fashion, by following ground truth features through the feature matching in the discriminative space learned by the loss network. In addition, we also introduce a new relation loss in order to maintain the temporal consistency in output videos. Experiments on video superresolution and deblurring show that our method generates visually more pleasing videos with better quantitative perceptual metric values than the other state-of-the-art methods.
△ Less
Submitted 20 March, 2020;
originally announced March 2020.