-
Pseudo-random sequences for low-cost operando impedance measurements of Li-ion batteries
Authors:
Jussi Sihvo,
Noël Hallemans,
Ai Hui Tan,
David A. Howey,
Stephen R. Duncan,
Tomi Roinila
Abstract:
Operando impedance measurements are promising for monitoring batteries in the field. In this work, we present pseudo-random sequences for low-cost operando battery impedance measurements. The quadratic-residue ternary sequence and direct-synthesis ternary sequence exhibit specific properties related to eigenvectors of the discrete Fourier transform matrix that allow computationally efficient compensation for drifts and transients in operando impedance measurements. We describe the application of pseudo-random sequences and provide the data processing required to suppress drift and transients, validated on simulations. Finally, we perform experimental operando impedance measurements on a Li-ion battery cell during fast-charging, demonstrating the applicability of the proposed method. Its low-cost hardware requirements, fast measurements, and simple data processing make the method practical for embedding in battery management systems.
Submitted 9 June, 2025;
originally announced June 2025.
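The eigenvector property mentioned in the abstract above can be illustrated with a small sketch. It assumes that the quadratic-residue ternary sequence is the Legendre-symbol sequence of odd prime length p (0 at index 0, +1 at quadratic residues, -1 elsewhere); the abstract does not spell out the construction, so this is an assumption, and the function names are hypothetical. For such a sequence, the DFT (with the exp(-2*pi*i*k*n/p) convention) returns the sequence itself scaled by sqrt(p) when p % 4 == 1 and by -i*sqrt(p) when p % 4 == 3.

```python
import cmath
import math


def quadratic_residue_ternary(p):
    """Legendre-symbol sequence of odd prime length p (assumed construction)."""
    residues = {(n * n) % p for n in range(1, p)}
    return [0] + [1 if n in residues else -1 for n in range(1, p)]


def dft(x):
    """Naive O(N^2) DFT using the exp(-2*pi*i*k*n/N) convention."""
    n = len(x)
    return [
        sum(x[m] * cmath.exp(-2j * cmath.pi * k * m / n) for m in range(n))
        for k in range(n)
    ]


def dft_eigenvalue(p):
    """Scale factor relating the DFT of the sequence to the sequence itself."""
    return math.sqrt(p) if p % 4 == 1 else -1j * math.sqrt(p)


def is_dft_eigenvector(p, tol=1e-9):
    """Check numerically that DFT(x) equals eigenvalue * x for length p."""
    x = quadratic_residue_ternary(p)
    lam = dft_eigenvalue(p)
    return all(abs(X_k - lam * x_k) < tol for X_k, x_k in zip(dft(x), x))


if __name__ == "__main__":
    print(quadratic_residue_ternary(7))
    print(is_dft_eigenvector(13), is_dft_eigenvector(7))
```

Because the spectrum is just a scaled copy of the time-domain sequence, every nonzero harmonic carries the same magnitude sqrt(p), which is one reason such sequences suit broadband excitation. How the paper uses this property for drift compensation is not reproduced here.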
-
Mobile Robot Navigation Using Hand-Drawn Maps: A Vision Language Model Approach
Authors:
Aaron Hao Tan,
Angus Fung,
Haitong Wang,
Goldie Nejat
Abstract:
Hand-drawn maps can be used to convey navigation instructions between humans and robots in a natural and efficient manner. However, these maps can often contain inaccuracies such as scale distortions and missing landmarks which present challenges for mobile robot navigation. This paper introduces a novel Hand-drawn Map Navigation (HAM-Nav) architecture that leverages pre-trained vision language models (VLMs) for robot navigation across diverse environments, hand-drawing styles, and robot embodiments, even in the presence of map inaccuracies. HAM-Nav integrates a unique Selective Visual Association Prompting approach for topological map-based position estimation and navigation planning as well as a Predictive Navigation Plan Parser to infer missing landmarks. Extensive experiments were conducted in photorealistic simulated environments, using both wheeled and legged robots, demonstrating the effectiveness of HAM-Nav in terms of navigation success rates and Success weighted by Path Length. Furthermore, a user study in real-world environments highlighted the practical utility of hand-drawn maps for robot navigation as well as successful navigation outcomes compared against a non-hand-drawn map approach.
Submitted 28 April, 2025; v1 submitted 31 January, 2025;
originally announced February 2025.
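The Success weighted by Path Length (SPL) metric named in the abstract above is commonly computed as the mean over episodes of S_i * l_i / max(p_i, l_i), where S_i is the binary success flag, l_i the shortest-path length, and p_i the length of the path the robot actually took. The sketch below follows that common definition; the function and argument names are hypothetical, and the paper may apply its own variant.

```python
def spl(successes, shortest_lengths, path_lengths):
    """Success weighted by Path Length, averaged over episodes.

    successes: 1 for a successful episode, 0 otherwise
    shortest_lengths: shortest-path length per episode (must be > 0)
    path_lengths: length of the path actually travelled per episode
    """
    if not successes:
        raise ValueError("at least one episode is required")
    total = 0.0
    for s, l, p in zip(successes, shortest_lengths, path_lengths):
        if l <= 0:
            raise ValueError("shortest-path length must be positive")
        # A failed episode contributes 0; a success is discounted by detour.
        total += s * l / max(p, l)
    return total / len(successes)


if __name__ == "__main__":
    # Three episodes: a 25% detour, a failure, and an optimal path.
    print(spl([1, 0, 1], [10.0, 5.0, 8.0], [12.5, 7.0, 8.0]))
```

Taking max(p, l) in the denominator keeps each episode's score at or below 1 even if the measured path is slightly shorter than the reference shortest path.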
-
MLLM-Search: A Zero-Shot Approach to Finding People using Multimodal Large Language Models
Authors:
Angus Fung,
Aaron Hao Tan,
Haitong Wang,
Beno Benhabib,
Goldie Nejat
Abstract:
Robotic search of people in human-centered environments, including healthcare settings, is challenging as autonomous robots need to locate people without complete or any prior knowledge of their schedules, plans or locations. Furthermore, robots need to be able to adapt to real-time events that can influence a person's plan in an environment. In this paper, we present MLLM-Search, a novel zero-shot person search architecture that leverages multimodal large language models (MLLMs) to address the mobile robot problem of searching for a person under event-driven scenarios with varying user schedules. Our approach introduces a novel visual prompting method to provide robots with spatial understanding of the environment by generating a spatially grounded waypoint map, representing navigable waypoints by a topological graph and regions by semantic labels. This is incorporated into an MLLM with a region planner that selects the next search region based on the semantic relevance to the search scenario, and a waypoint planner which generates a search path by considering the semantically relevant objects and the local spatial context through our unique spatial chain-of-thought prompting approach. Extensive 3D photorealistic experiments were conducted to validate the performance of MLLM-Search in searching for a person with a changing schedule in different environments. An ablation study was also conducted to validate the main design choices of MLLM-Search. Furthermore, a comparison study with state-of-the-art search methods demonstrated that MLLM-Search outperforms existing methods with respect to search efficiency. Real-world experiments with a mobile robot in a multi-room floor of a building showed that MLLM-Search was able to generalize to finding a person in a new unseen environment.
Submitted 27 November, 2024;
originally announced December 2024.
-
Find Everything: A General Vision Language Model Approach to Multi-Object Search
Authors:
Daniel Choi,
Angus Fung,
Haitong Wang,
Aaron Hao Tan
Abstract:
The Multi-Object Search (MOS) problem involves navigating to a sequence of locations to maximize the likelihood of finding target objects while minimizing travel costs. In this paper, we introduce a novel approach to the MOS problem, called Finder, which leverages vision language models (VLMs) to locate multiple objects across diverse environments. Specifically, our approach introduces multi-channel score maps to track and reason about multiple objects simultaneously during navigation, along with a score map technique that combines scene-level and object-level semantic correlations. Experiments in both simulated and real-world settings showed that Finder outperforms existing methods using deep reinforcement learning and VLMs. Ablation and scalability studies further validated our design choices and robustness with increasing numbers of target objects, respectively. Website: https://find-all-my-things.github.io/
Submitted 1 March, 2025; v1 submitted 1 October, 2024;
originally announced October 2024.
-
OLiVia-Nav: An Online Lifelong Vision Language Approach for Mobile Robot Social Navigation
Authors:
Siddarth Narasimhan,
Aaron Hao Tan,
Daniel Choi,
Goldie Nejat
Abstract:
Service robots in human-centered environments such as hospitals, office buildings, and long-term care homes need to navigate while adhering to social norms to ensure the safety and comfort of the people they are sharing the space with. Furthermore, they need to adapt to new social scenarios that can arise during robot navigation. In this paper, we present a novel Online Lifelong Vision Language architecture, OLiVia-Nav, which uniquely integrates vision-language models (VLMs) with an online lifelong learning framework for robot social navigation. We introduce a unique distillation approach, Social Context Contrastive Language Image Pre-training (SC-CLIP), to transfer the social reasoning capabilities of large VLMs to a lightweight VLM, in order for OLiVia-Nav to directly encode social and environment context during robot navigation. These encoded embeddings are used to generate and select socially compliant robot trajectories. The lifelong learning capabilities of SC-CLIP enable OLiVia-Nav to update the robot trajectory planning over time as new social scenarios are encountered. We conducted extensive real-world experiments in diverse social navigation scenarios. The results showed that OLiVia-Nav outperformed existing state-of-the-art DRL and VLM methods in terms of mean squared error, Hausdorff loss, and personal space violation duration. Ablation studies also verified the design choices for OLiVia-Nav.
Submitted 8 March, 2025; v1 submitted 20 September, 2024;
originally announced September 2024.
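The abstract above reports a Hausdorff loss between trajectories without defining it. As a rough illustration only, the sketch below computes the standard symmetric Hausdorff distance between two trajectories treated as point sets; whether the paper uses this exact form is an assumption, and the function names are hypothetical.

```python
import math


def directed_hausdorff(a, b):
    """Largest distance from any point in a to its nearest point in b."""
    return max(min(math.dist(p, q) for q in b) for p in a)


def hausdorff(a, b):
    """Symmetric Hausdorff distance between two non-empty point sets."""
    return max(directed_hausdorff(a, b), directed_hausdorff(b, a))


if __name__ == "__main__":
    # Hypothetical 2D waypoints for a predicted and a reference trajectory.
    predicted = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
    reference = [(0.0, 1.0), (1.0, 1.0), (2.0, 3.0)]
    print(hausdorff(predicted, reference))
```

Unlike mean squared error, this measure is driven by the single worst deviation between the two trajectories, so the two metrics reported in the abstract capture different aspects of trajectory mismatch.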
-
4CNet: A Diffusion Approach to Map Prediction for Decentralized Multi-Robot Exploration
Authors:
Aaron Hao Tan,
Siddarth Narasimhan,
Goldie Nejat
Abstract:
Mobile robots in unknown cluttered environments with irregularly shaped obstacles often face energy and communication challenges which directly affect their ability to explore these environments. In this paper, we introduce a novel deep learning architecture, Confidence-Aware Contrastive Conditional Consistency Model (4CNet), for robot map prediction during decentralized, resource-limited multi-robot exploration. 4CNet uniquely incorporates: 1) a conditional consistency model for map prediction in unstructured unknown regions, 2) a contrastive map-trajectory pretraining framework for a trajectory encoder that extracts spatial information from the trajectories of nearby robots during map prediction, and 3) a confidence network to measure the uncertainty of map prediction for effective exploration under resource constraints. We incorporate 4CNet within our proposed robot exploration with map prediction architecture, 4CNet-E. We then conduct extensive comparison studies with 4CNet-E and state-of-the-art heuristic and learning methods to investigate both map prediction and exploration performance in environments consisting of irregularly shaped obstacles and uneven terrain. Results showed that 4CNet-E obtained statistically significantly higher prediction accuracy and area coverage with varying environment sizes, numbers of robots, energy budgets, and communication limitations when compared to database and learning-based methods. Hardware experiments were performed and validated the applicability and generalizability of 4CNet-E in both unstructured indoor and real natural outdoor environments.
Submitted 8 April, 2025; v1 submitted 27 February, 2024;
originally announced February 2024.
-
NavFormer: A Transformer Architecture for Robot Target-Driven Navigation in Unknown and Dynamic Environments
Authors:
Haitong Wang,
Aaron Hao Tan,
Goldie Nejat
Abstract:
In unknown cluttered and dynamic environments such as disaster scenes, mobile robots need to perform target-driven navigation in order to find people or objects of interest, while being solely guided by images of the targets. In this paper, we introduce NavFormer, a novel end-to-end transformer architecture developed for robot target-driven navigation in unknown and dynamic environments. NavFormer leverages the strengths of both 1) transformers for sequential data processing and 2) self-supervised learning (SSL) for visual representation to reason about spatial layouts and to perform collision-avoidance in dynamic settings. The architecture uniquely combines dual-visual encoders consisting of a static encoder for extracting invariant environment features for spatial reasoning, and a general encoder for dynamic obstacle avoidance. The primary robot navigation task is decomposed into two sub-tasks for training: single robot exploration and multi-robot collision avoidance. We perform cross-task training to enable the transfer of learned skills to the complex primary navigation task without the need for task-specific fine-tuning. Simulated experiments demonstrate that NavFormer can effectively navigate a mobile robot in diverse unknown environments, outperforming existing state-of-the-art methods in terms of success rate and success weighted by (normalized inverse) path length. Furthermore, a comprehensive ablation study is performed to evaluate the impact of the main design choices of the structure and training of NavFormer, further validating their effectiveness in the overall system.
Submitted 8 July, 2024; v1 submitted 9 February, 2024;
originally announced February 2024.
-
Deep Reinforcement Learning for Decentralized Multi-Robot Exploration With Macro Actions
Authors:
Aaron Hao Tan,
Federico Pizarro Bejarano,
Yuhan Zhu,
Richard Ren,
Goldie Nejat
Abstract:
Cooperative multi-robot teams need to be able to explore cluttered and unstructured environments while dealing with communication dropouts that prevent them from exchanging local information to maintain team coordination. Therefore, robots need to consider high-level teammate intentions during action selection. In this letter, we present the first Macro Action Decentralized Exploration Network (MADE-Net) using multi-agent deep reinforcement learning (DRL) to address the challenges of communication dropouts during multi-robot exploration in unseen, unstructured, and cluttered environments. Simulated robot team exploration experiments were conducted and compared against classical and DRL methods where MADE-Net outperformed all benchmark methods in terms of computation time, total travel distance, number of local interactions between robots, and exploration rate across various degrees of communication dropouts. A scalability study in 3D environments showed a decrease in exploration time with MADE-Net with increasing team and environment sizes. The experiments presented highlight the effectiveness and robustness of our method.
Submitted 26 February, 2024; v1 submitted 5 October, 2021;
originally announced October 2021.
-
Meme as Building Block for Evolutionary Optimization of Problem Instances
Authors:
Liang Feng,
Yew Soon Ong,
Ah Hwee Tan,
Ivor Wai-Hung Tsang
Abstract:
A significantly under-explored area of evolutionary optimization in the literature is the study of optimization methodologies that can evolve along with the problems solved. In particular, present evolutionary optimization approaches generally start their search from scratch or the ground-zero state of knowledge, independent of how similar the given new problem of interest is to those optimized previously. There has thus been an apparent lack of automated knowledge transfer and reuse across problems. Taking the cue, this paper introduces a novel Memetic Computational Paradigm for search, one that is modeled on how humans solve problems, and embarks on a study towards intelligent evolutionary optimization of problems through the transfer of structured knowledge in the form of memes learned from previous problem-solving experiences, to enhance future evolutionary searches. In particular, the proposed memetic search paradigm is composed of four culture-inspired operators, namely, Meme Learning, Meme Selection, Meme Variation and Meme Imitation. The learning operator mines for memes in the form of latent structures derived from past experiences of problem-solving. The selection operator identifies the fit memes that replicate and transmit across problems, while the variation operator introduces innovations into the memes. The imitation operator, on the other hand, defines how fit memes assimilate into the search process of newly encountered problems, thus gearing towards efficient and effective evolutionary optimization. Finally, comprehensive studies on two widely studied, challenging, and well-established NP-hard routing problem domains, namely capacitated vehicle routing (CVR) and capacitated arc routing (CAR), confirm the high efficacy of the proposed memetic computational search paradigm for intelligent evolutionary optimization of problems.
Submitted 3 July, 2012;
originally announced July 2012.