
Showing 1–50 of 94 results for author: Le, C

  1. arXiv:2507.01340  [pdf, ps, other]

    cs.CV

    Physics-informed Ground Reaction Dynamics from Human Motion Capture

    Authors: Cuong Le, Huy-Phuong Le, Duc Le, Minh-Thien Duong, Van-Binh Nguyen, My-Ha Le

    Abstract: Body dynamics are crucial information for the analysis of human motions in important research fields, ranging from biomechanics and sports science to computer vision and graphics. Modern approaches collect the body dynamics, external reactive force specifically, via force plates, synchronizing with human motion capture data, and learn to estimate the dynamics from a black-box deep learning model. Bei…

    Submitted 2 July, 2025; originally announced July 2025.

    Comments: 6 pages, 4 figures, 4 tables, HSI 2025

  2. arXiv:2506.12182  [pdf, ps, other]

    cs.CL

    Instruction Tuning and CoT Prompting for Contextual Medical QA with LLMs

    Authors: Chenqian Le, Ziheng Gong, Chihang Wang, Haowei Ni, Panfeng Li, Xupeng Chen

    Abstract: Large language models (LLMs) have shown great potential in medical question answering (MedQA), yet adapting them to biomedical reasoning remains challenging due to domain-specific complexity and limited supervision. In this work, we study how prompt design and lightweight fine-tuning affect the performance of open-source LLMs on PubMedQA, a benchmark for multiple-choice biomedical questions. We fo…

    Submitted 13 June, 2025; originally announced June 2025.

    Comments: Accepted by 2025 International Conference on Artificial Intelligence, Human-Computer Interaction and Natural Language Processing

  3. arXiv:2506.06262  [pdf, ps, other]

    cs.RO cs.SE eess.SY

    PyGemini: Unified Software Development towards Maritime Autonomy Systems

    Authors: Kjetil Vasstein, Christian Le, Simon Lervåg Breivik, Trygve Maukon Myhr, Annette Stahl, Edmund Førland Brekke

    Abstract: Ensuring the safety and certifiability of autonomous surface vessels (ASVs) requires robust decision-making systems, supported by extensive simulation, testing, and validation across a broad range of scenarios. However, the current landscape of maritime autonomy development is fragmented -- relying on disparate tools for communication, simulation, monitoring, and system integration -- which hamper…

    Submitted 6 June, 2025; originally announced June 2025.

    Comments: Preprint. Not yet submitted for peer review. Includes 14 figures and 3 tables. 18 pages, 1 appendix

    ACM Class: D.2.11; I.6.2; I.2.9

  4. arXiv:2506.03722  [pdf, other]

    cs.CL cs.SD eess.AS

    MFLA: Monotonic Finite Look-ahead Attention for Streaming Speech Recognition

    Authors: Yinfeng Xia, Huiyan Li, Chenyang Le, Manhong Wang, Yutao Sun, Xingyang Ma, Yanmin Qian

    Abstract: Applying large pre-trained speech models like Whisper has shown promise in reducing training costs for various speech tasks. However, integrating these models into streaming systems remains a challenge. This paper presents a novel prefix-to-prefix training framework for streaming recognition by fine-tuning Whisper. We introduce the Continuous Integrate-and-Fire mechanism to establish a quasi-m…

    Submitted 4 June, 2025; originally announced June 2025.

    Comments: Accepted by Interspeech 2025

  5. arXiv:2505.21992  [pdf]

    cs.RO

    Soft Electrothermal Meta-Actuator for Robust Multifunctional Control

    Authors: Hanseong Jo, Pavel Shafirin, Christopher Le, Caden Chan, Artur Davoyan

    Abstract: Soft electrothermal actuators are of great interest in diverse application domains for their simplicity, compliance, and ease of control. However, the very nature of thermally induced mechanical actuation sets inherent operation constraints: unidirectional motion, environmental sensitivity, and slow response times limited by passive cooling. To overcome these constraints, we propose a meta-actuato…

    Submitted 28 May, 2025; originally announced May 2025.

    Comments: 23 pages, 5 figures

  6. arXiv:2505.20030  [pdf, ps, other]

    cs.LG cs.AI nlin.CD physics.comp-ph

    Multiple Descents in Deep Learning as a Sequence of Order-Chaos Transitions

    Authors: Wenbo Wei, Nicholas Chong Jia Le, Choy Heng Lai, Ling Feng

    Abstract: We observe a novel 'multiple-descent' phenomenon during the training process of LSTM, in which the test loss goes through long cycles of up and down trend multiple times after the model is overtrained. By carrying out asymptotic stability analysis of the models, we found that the cycles in test loss are closely associated with the phase transition process between order and chaos, and the local opt…

    Submitted 26 May, 2025; originally announced May 2025.

  7. arXiv:2505.07583  [pdf, ps, other]

    cs.SE

    Privacy-Preserving Real-Time Vietnamese-English Translation on iOS using Edge AI

    Authors: Cong Le

    Abstract: This research addresses the growing need for privacy-preserving and accessible language translation by developing a fully offline Neural Machine Translation (NMT) system for Vietnamese-English translation on iOS devices. Given increasing concerns about data privacy and unreliable network connectivity, on-device translation offers critical advantages. This project confronts challenges in deploying…

    Submitted 12 May, 2025; originally announced May 2025.

  8. arXiv:2504.21017  [pdf, ps, other]

    cs.CL cs.LG

    ViQA-COVID: COVID-19 Machine Reading Comprehension Dataset for Vietnamese

    Authors: Hai-Chung Nguyen-Phung, Ngoc C. Lê, Van-Chien Nguyen, Hang Thi Nguyen, Thuy Phuong Thi Nguyen

    Abstract: Two years after its appearance, COVID-19 has negatively affected people and normal life around the world. As of May 2022, there are more than 522 million cases and six million deaths worldwide (including nearly ten million cases and over forty-three thousand deaths in Vietnam). Economy and society are both severely affected. The variant of COVID-19, Omicron, has broken disease prevention measures o…

    Submitted 14 June, 2025; v1 submitted 21 April, 2025; originally announced April 2025.

    Comments: 8 pages. Technical report

  9. arXiv:2504.21016  [pdf, ps, other]

    cs.CL cs.LG

    Nested Named-Entity Recognition on Vietnamese COVID-19: Dataset and Experiments

    Authors: Ngoc C. Lê, Hai-Chung Nguyen-Phung, Thu-Huong Pham Thi, Hue Vu, Phuong-Thao Nguyen Thi, Thu-Thuy Tran, Hong-Nhung Le Thi, Thuy-Duong Nguyen-Thi, Thanh-Huy Nguyen

    Abstract: The COVID-19 pandemic caused great losses worldwide; efforts have been made to contain it, but many countries have failed. In Vietnam, the traceability, localization, and quarantine of people who come into contact with patients contribute to effective disease prevention. However, this is done by hand and takes a lot of work. In this research, we describe a named-entity recognition (NER) study that assists in t…

    Submitted 14 June, 2025; v1 submitted 21 April, 2025; originally announced April 2025.

    Comments: 8 pages. AI4SG-21 The 3rd Workshop on Artificial Intelligence for Social Good at IJCAI 2021

  10. arXiv:2504.18942  [pdf, other]

    cs.CL cs.AI cs.LG

    LawFlow: Collecting and Simulating Lawyers' Thought Processes

    Authors: Debarati Das, Khanh Chi Le, Ritik Sachin Parkar, Karin De Langis, Brendan Madson, Chad M. Berryman, Robin M. Willis, Daniel H. Moses, Brett McDonnell, Daniel Schwarcz, Dongyeop Kang

    Abstract: Legal practitioners, particularly those early in their careers, face complex, high-stakes tasks that require adaptive, context-sensitive reasoning. While AI holds promise in supporting legal work, current datasets and models are narrowly focused on isolated subtasks and fail to capture the end-to-end decision-making required in real-world practice. To address this gap, we introduce LawFlow, a data…

    Submitted 26 April, 2025; originally announced April 2025.

    Comments: submitted to COLM 2025

  11. arXiv:2504.07334  [pdf, other]

    cs.CV cs.AI cs.LG

    Objaverse++: Curated 3D Object Dataset with Quality Annotations

    Authors: Chendi Lin, Heshan Liu, Qunshu Lin, Zachary Bright, Shitao Tang, Yihui He, Minghao Liu, Ling Zhu, Cindy Le

    Abstract: This paper presents Objaverse++, a curated subset of Objaverse enhanced with detailed attribute annotations by human experts. Recent advances in 3D content generation have been driven by large-scale datasets such as Objaverse, which contains over 800,000 3D objects collected from the Internet. Although Objaverse represents the largest available 3D asset collection, its utility is limited by the pr…

    Submitted 11 April, 2025; v1 submitted 9 April, 2025; originally announced April 2025.

    Comments: 8 pages, 8 figures. Accepted to CVPR 2025 Workshop on Efficient Large Vision Models (April 2025)

    MSC Class: 68T45; 68T07

    ACM Class: I.2.10; I.3.5; I.3.7; I.4.8; I.5.1

  12. arXiv:2504.02789  [pdf, other]

    cs.CL

    A Framework for Robust Cognitive Evaluation of LLMs

    Authors: Karin de Langis, Jong Inn Park, Bin Hu, Khanh Chi Le, Andreas Schramm, Michael C. Mensink, Andrew Elfenbein, Dongyeop Kang

    Abstract: Emergent cognitive abilities in large language models (LLMs) have been widely observed, but their nature and underlying mechanisms remain poorly understood. A growing body of research draws on cognitive science to investigate LLM cognition, but standard methodologies and experimental pipelines have not yet been established. To address this gap we develop CognitivEval, a framework for systematical…

    Submitted 3 April, 2025; originally announced April 2025.

  13. arXiv:2503.18201  [pdf, other]

    cs.LG

    Iterative Multi-Agent Reinforcement Learning: A Novel Approach Toward Real-World Multi-Echelon Inventory Optimization

    Authors: Georg Ziegner, Michael Choi, Hung Mac Chan Le, Sahil Sakhuja, Arash Sarmadi

    Abstract: Multi-echelon inventory optimization (MEIO) is critical for effective supply chain management, but its inherent complexity can pose significant challenges. Heuristics are commonly used to address this complexity, yet they often face limitations in scope and scalability. Recent research has found deep reinforcement learning (DRL) to be a promising alternative to traditional heuristics, offering gre…

    Submitted 23 March, 2025; originally announced March 2025.

    Comments: A Capstone Report in the Field of Data Science for the Degree of Master of Liberal Arts in Extension Studies - Harvard University

  14. arXiv:2503.14641  [pdf]

    cs.SI eess.SY

    Link Prediction and Navigability of Multiplex Energy Networks

    Authors: Muhammad Kazim, Harun Pirim, Chau Le, Trung Le, Om Prakash Yadav

    Abstract: In modern energy networks, where operational efficiency and resilience are critical, this study introduces an in-depth analysis from a multiplex network perspective - defined as a network where multiple types of connections exist between the same set of nodes. Utilizing Belgium's electricity and gas networks, we construct a five-layer multiplex network to simulate random node shutdown scenarios. W…

    Submitted 18 March, 2025; originally announced March 2025.

  15. arXiv:2503.13478  [pdf]

    eess.SP cs.CR cs.CY

    Advancing Highway Work Zone Safety: A Comprehensive Review of Sensor Technologies for Intrusion and Proximity Hazards

    Authors: Ayenew Yihune Demeke, Moein Younesi Heravi, Israt Sharmin Dola, Youjin Jang, Chau Le, Inbae Jeong, Zhibin Lin, Danling Wang

    Abstract: Highway work zones are critical areas where accidents frequently occur, often due to the proximity of workers to heavy machinery and ongoing traffic. With technological advancements in sensor technologies and the Internet of Things, promising solutions are emerging to address these safety concerns. This paper provides a systematic review of existing studies on the application of sensor technologie…

    Submitted 4 March, 2025; originally announced March 2025.

    Comments: 4 Figures, 5 Tables

  16. arXiv:2502.08371  [pdf, other]

    cs.CL

    Unveiling Global Discourse Structures: Theoretical Analysis and NLP Applications in Argument Mining

    Authors: Christopher van Le

    Abstract: Particularly in the structure of global discourse, coherence plays a pivotal role in human text comprehension and is a hallmark of high-quality text. This is especially true for persuasive texts, where coherent argument structures support claims effectively. This paper discusses and proposes methods for detecting, extracting and representing these global discourse structures in a process called A…

    Submitted 12 February, 2025; originally announced February 2025.

  17. arXiv:2502.07937  [pdf, ps, other]

    cs.LG stat.ML

    Active Advantage-Aligned Online Reinforcement Learning with Offline Data

    Authors: Xuefeng Liu, Hung T. C. Le, Siyu Chen, Rick Stevens, Zhuoran Yang, Matthew R. Walter, Yuxin Chen

    Abstract: Online reinforcement learning (RL) enhances policies through direct interactions with the environment, but faces challenges related to sample efficiency. In contrast, offline RL leverages extensive pre-collected data to learn policies, but often produces suboptimal results due to limited data coverage. Recent efforts integrate offline and online RL in order to harness the advantages of both approa…

    Submitted 30 May, 2025; v1 submitted 11 February, 2025; originally announced February 2025.

  18. arXiv:2412.08250  [pdf, other]

    cs.IT

    Fast Beam Placement for Ultra-Dense LEO Networks

    Authors: Trinh Van Chien, Nguyen Minh Quan, Tri Nhu Do, Cuong Le, Tan N. Nguyen, Symeon Chatzinotas

    Abstract: Low Earth orbit (LEO) satellites have brought about significant improvements in wireless communications, characterized by low latency and reduced transmission loss compared to geostationary orbit (GSO) satellites. Ultra-dense LEO satellites can serve many users by generating active beams effective to their locations. The beam placement problem is challenging but important for efficiently allocating…

    Submitted 11 December, 2024; originally announced December 2024.

    Comments: 5 pages, 3 figures. Accepted by IEEE WCL

  19. arXiv:2411.05524  [pdf, other]

    cs.CV cs.GR

    Alignment of 3D woodblock geometrical models and 2D orthographic projection image

    Authors: Minh Duc Nguyen, Cong Thuong Le, Trong Lam Nguyen

    Abstract: The accurate alignment of 3D woodblock geometrical models with 2D orthographic projection images presents a significant challenge in the digital preservation of Vietnamese cultural heritage. This paper proposes a unified image processing algorithm to address this issue, enhancing the registration quality between 3D woodblock models and their 2D representations. The method includes determining the…

    Submitted 8 November, 2024; originally announced November 2024.

  20. arXiv:2410.23402  [pdf, other]

    cs.SE

    VisualCoder: Guiding Large Language Models in Code Execution with Fine-grained Multimodal Chain-of-Thought Reasoning

    Authors: Cuong Chi Le, Hoang-Chau Truong-Vinh, Huy Nhat Phan, Dung Duy Le, Tien N. Nguyen, Nghi D. Q. Bui

    Abstract: Predicting program behavior and reasoning about code execution remain significant challenges in software engineering, particularly for large language models (LLMs) designed for code analysis. While these models excel at understanding static syntax, they often struggle with dynamic reasoning tasks. We introduce VisualCoder, a simple yet effective approach that enhances code reasoning by integrating…

    Submitted 9 February, 2025; v1 submitted 30 October, 2024; originally announced October 2024.

    Comments: NAACL 2025

  21. arXiv:2410.07795  [pdf, other]

    cs.CV

    Optimal-state Dynamics Estimation for Physics-based Human Motion Capture from Videos

    Authors: Cuong Le, Viktor Johansson, Manon Kok, Bastian Wandt

    Abstract: Human motion capture from monocular videos has made significant progress in recent years. However, modern approaches often produce temporal artifacts, e.g. in the form of jittery motion, and struggle to achieve smooth and physically plausible motions. Explicitly integrating physics, in the form of internal forces and exterior torques, helps alleviate these artifacts. Current state-of-the-art approaches m…

    Submitted 14 May, 2025; v1 submitted 10 October, 2024; originally announced October 2024.

    Comments: 17 pages, 7 figures, NeurIPS 2024

  22. arXiv:2408.13779  [pdf, other]

    cs.PL cs.DC

    Concurrent Data Structures Made Easy (Extended Version)

    Authors: Callista Le, Kiran Gopinathan, Koon Wen Lee, Seth Gilbert, Ilya Sergey

    Abstract: Design of an efficient thread-safe concurrent data structure is a balancing act between its implementation complexity and performance. Lock-based concurrent data structures, which are relatively easy to derive from their sequential counterparts and to prove thread-safe, suffer from poor throughput under even light multi-threaded workload. At the same time, lock-free concurrent structures allow for…

    Submitted 25 August, 2024; originally announced August 2024.

    Comments: Extended version of the OOPSLA'24 paper

  23. arXiv:2408.12593  [pdf, other]

    cs.RO cs.CV

    Automating Deformable Gasket Assembly

    Authors: Simeon Adebola, Tara Sadjadpour, Karim El-Refai, Will Panitch, Zehan Ma, Roy Lin, Tianshuang Qiu, Shreya Ganti, Charlotte Le, Jaimyn Drake, Ken Goldberg

    Abstract: In Gasket Assembly, a deformable gasket must be aligned and pressed into a narrow channel. This task is common for sealing surfaces in the manufacturing of automobiles, appliances, electronics, and other products. Gasket Assembly is a long-horizon, high-precision task and the gasket must align with the channel and be fully pressed in to achieve a secure fit. To compare approaches, we present 4 met…

    Submitted 22 August, 2024; originally announced August 2024.

    Comments: Content without Appendix accepted for IEEE CASE 2024

  24. arXiv:2408.02816  [pdf, other]

    cs.SE

    CodeFlow: Program Behavior Prediction with Dynamic Dependencies Learning

    Authors: Cuong Chi Le, Hoang Nhat Phan, Huy Nhat Phan, Tien N. Nguyen, Nghi D. Q. Bui

    Abstract: Predicting program behavior without execution is a critical task in software engineering. Existing models often fall short in capturing the dynamic dependencies among program elements. To address this, we present CodeFlow, a novel machine learning-based approach that predicts code coverage and detects runtime errors by learning both static and dynamic dependencies within the code. By using control…

    Submitted 9 February, 2025; v1 submitted 5 August, 2024; originally announced August 2024.

    Comments: FORGE 2025

  25. arXiv:2407.19203  [pdf, ps, other]

    cs.CR cs.AI

    Towards Clean-Label Backdoor Attacks in the Physical World

    Authors: Thinh Dao, Cuong Chi Le, Khoa D Doan, Kok-Seng Wong

    Abstract: Deep Neural Networks (DNNs) are shown to be vulnerable to backdoor poisoning attacks, with most research focusing on digital triggers -- special patterns added to test-time inputs to induce targeted misclassification. Physical triggers, natural objects within a physical scene, have emerged as a desirable alternative since they enable real-time backdoor activations without digital…

    Submitted 7 July, 2025; v1 submitted 27 July, 2024; originally announced July 2024.

    Comments: 21 pages, 17 figures, 16 tables

  26. arXiv:2406.04423  [pdf, other]

    stat.ME cs.SI physics.soc-ph

    Determining the Number of Communities in Sparse and Imbalanced Settings

    Authors: Zhixuan Shao, Can M. Le

    Abstract: Community structures represent a crucial aspect of network analysis, and various methods have been developed to identify these communities. However, a common hurdle lies in determining the number of communities K, a parameter that often requires estimation in practice. Existing approaches for estimating K face two notable challenges: the weak community signal present in sparse networks and the imb…

    Submitted 6 June, 2024; originally announced June 2024.

  27. arXiv:2405.17809  [pdf, other]

    cs.CL cs.AI cs.SD eess.AS

    TransVIP: Speech to Speech Translation System with Voice and Isochrony Preservation

    Authors: Chenyang Le, Yao Qian, Dongmei Wang, Long Zhou, Shujie Liu, Xiaofei Wang, Midia Yousefi, Yanmin Qian, Jinyu Li, Sheng Zhao, Michael Zeng

    Abstract: There is a rising interest and trend in research towards directly translating speech from one language to another, known as end-to-end speech-to-speech translation. However, most end-to-end models struggle to outperform cascade models, i.e., a pipeline framework by concatenating speech recognition, machine translation and text-to-speech models. The primary challenges stem from the inherent complex…

    Submitted 30 October, 2024; v1 submitted 28 May, 2024; originally announced May 2024.

    Comments: Neural Information Processing Systems, poster

  28. arXiv:2404.11792  [pdf, other]

    cs.AI

    Enhancing Q&A with Domain-Specific Fine-Tuning and Iterative Reasoning: A Comparative Study

    Authors: Zooey Nguyen, Anthony Annunziata, Vinh Luong, Sang Dinh, Quynh Le, Anh Hai Ha, Chanh Le, Hong An Phan, Shruti Raghavan, Christopher Nguyen

    Abstract: This paper investigates the impact of domain-specific model fine-tuning and of reasoning mechanisms on the performance of question-answering (Q&A) systems powered by large language models (LLMs) and Retrieval-Augmented Generation (RAG). Using the FinanceBench SEC financial filings dataset, we observe that, for RAG, combining a fine-tuned embedding model with a fine-tuned LLM achieves better accura…

    Submitted 19 April, 2024; v1 submitted 17 April, 2024; originally announced April 2024.

    Comments: Fixed typo of OODA's score on harder-question set in Table 2

  29. arXiv:2404.10279  [pdf, other]

    cs.CV

    EucliDreamer: Fast and High-Quality Texturing for 3D Models with Depth-Conditioned Stable Diffusion

    Authors: Cindy Le, Congrui Hetang, Chendi Lin, Ang Cao, Yihui He

    Abstract: We present EucliDreamer, a simple and effective method to generate textures for 3D models given text prompts and meshes. The texture is parametrized as an implicit function on the 3D surface, which is optimized with the Score Distillation Sampling (SDS) process and differentiable rendering. To generate high-quality textures, we leverage a depth-conditioned Stable Diffusion model guided by the dept…

    Submitted 16 April, 2024; originally announced April 2024.

    Comments: Short version of arXiv:2311.15573

  30. arXiv:2403.16051  [pdf, other]

    cs.CV

    Segment Anything Model for Road Network Graph Extraction

    Authors: Congrui Hetang, Haoru Xue, Cindy Le, Tianwei Yue, Wenping Wang, Yihui He

    Abstract: We propose SAM-Road, an adaptation of the Segment Anything Model (SAM) for extracting large-scale, vectorized road network graphs from satellite imagery. To predict graph geometry, we formulate it as a dense semantic segmentation task, leveraging the inherent strengths of SAM. The image encoder of SAM is fine-tuned to produce probability masks for roads and intersections, from which the graph vert…

    Submitted 12 April, 2024; v1 submitted 24 March, 2024; originally announced March 2024.

    Comments: Accepted by IEEE/CVF Computer Vision and Pattern Recognition Conference (CVPR) 2024, 2nd Workshop on Scene Graphs and Graph Representation Learning

  31. arXiv:2403.12945  [pdf, other]

    cs.RO

    DROID: A Large-Scale In-The-Wild Robot Manipulation Dataset

    Authors: Alexander Khazatsky, Karl Pertsch, Suraj Nair, Ashwin Balakrishna, Sudeep Dasari, Siddharth Karamcheti, Soroush Nasiriany, Mohan Kumar Srirama, Lawrence Yunliang Chen, Kirsty Ellis, Peter David Fagan, Joey Hejna, Masha Itkina, Marion Lepert, Yecheng Jason Ma, Patrick Tree Miller, Jimmy Wu, Suneel Belkhale, Shivin Dass, Huy Ha, Arhan Jain, Abraham Lee, Youngwoon Lee, Marius Memmel, Sungjae Park, et al. (76 additional authors not shown)

    Abstract: The creation of large, diverse, high-quality robot manipulation datasets is an important stepping stone on the path toward more capable and robust robotic manipulation policies. However, creating such datasets is challenging: collecting robot manipulation data in diverse environments poses logistical and safety challenges and requires substantial investments in hardware and human labour. As a resu…

    Submitted 22 April, 2025; v1 submitted 19 March, 2024; originally announced March 2024.

    Comments: Project website: https://droid-dataset.github.io/

  32. arXiv:2401.03790  [pdf, other]

    cs.LG cs.CR cs.PL cs.SE

    Inferring Properties of Graph Neural Networks

    Authors: Dat Nguyen, Hieu M. Vu, Cong-Thanh Le, Bach Le, David Lo, ThanhVu Nguyen, Corina Pasareanu

    Abstract: We propose GNNInfer, the first automatic property inference technique for GNNs. To tackle the challenge of varying input structures in GNNs, GNNInfer first identifies a set of representative influential structures that contribute significantly towards the prediction of a GNN. Using these structures, GNNInfer converts each pair of an influential structure and the GNN to their equivalent FNN and the…

    Submitted 2 March, 2024; v1 submitted 8 January, 2024; originally announced January 2024.

    Comments: 20 pages main paper, 10 pages for appendix

  33. arXiv:2312.16717  [pdf, other]

    cs.CV cs.LG eess.IV

    Landslide Detection and Segmentation Using Remote Sensing Images and Deep Neural Network

    Authors: Cam Le, Lam Pham, Jasmin Lampert, Matthias Schlögl, Alexander Schindler

    Abstract: Knowledge about historic landslide event occurrence is important for supporting disaster risk reduction strategies. Building upon findings from 2022 Landslide4Sense Competition, we propose a deep neural network based system for landslide detection and segmentation from multisource remote sensing image input. We use a U-Net trained with Cross Entropy loss as baseline model. We then improve the U-Ne…

    Submitted 27 December, 2023; originally announced December 2023.

  34. arXiv:2311.15573  [pdf, other]

    cs.CV cs.GR

    EucliDreamer: Fast and High-Quality Texturing for 3D Models with Stable Diffusion Depth

    Authors: Cindy Le, Congrui Hetang, Chendi Lin, Ang Cao, Yihui He

    Abstract: This paper presents a novel method to generate textures for 3D models given text prompts and 3D meshes. Additional depth information is taken into account to perform the Score Distillation Sampling (SDS) process with depth conditional Stable Diffusion. We ran our model over the open-source dataset Objaverse and conducted a user study to compare the results with those of various 3D texturing method…

    Submitted 13 March, 2024; v1 submitted 27 November, 2023; originally announced November 2023.

  35. arXiv:2311.05600  [pdf, other]

    cs.RO eess.SY

    FogROS2-Config: Optimizing Latency and Cost for Multi-Cloud Robot Applications

    Authors: Kaiyuan Chen, Kush Hari, Rohil Khare, Charlotte Le, Trinity Chung, Jaimyn Drake, Jeffrey Ichnowski, John Kubiatowicz, Ken Goldberg

    Abstract: Cloud service providers offer over 50,000 distinct and dynamically changing cloud server options. To help roboticists make cost-effective decisions, we present FogROS2-Config, an open toolkit that takes ROS2 nodes as input and automatically runs relevant benchmarks to quickly return a menu of cloud compute services that trade off latency and cost. Because it is infeasible to try every hard…

    Submitted 13 May, 2024; v1 submitted 9 November, 2023; originally announced November 2023.

    Comments: Published 2024 IEEE International Conference on Robotics and Automation (ICRA), Former name: FogROS2-Sky

  36. arXiv:2311.03630  [pdf, ps, other]

    cs.LG stat.ME stat.ML

    CATE Estimation With Potential Outcome Imputation From Local Regression

    Authors: Ahmed Aloui, Juncheng Dong, Cat P. Le, Vahid Tarokh

    Abstract: One of the most significant challenges in Conditional Average Treatment Effect (CATE) estimation is the statistical discrepancy between distinct treatment groups. To address this issue, we propose a model-agnostic data augmentation method for CATE estimation. First, we derive regret bounds for general data augmentation methods suggesting that a small imputation error may be necessary for accurate…

    Submitted 13 June, 2025; v1 submitted 6 November, 2023; originally announced November 2023.

  37. arXiv:2311.02803  [pdf, other]

    cs.CV

    Fast and Interpretable Face Identification for Out-Of-Distribution Data Using Vision Transformers

    Authors: Hai Phan, Cindy Le, Vu Le, Yihui He, Anh Totti Nguyen

    Abstract: Most face identification approaches employ a Siamese neural network to compare two images at the image embedding level. Yet, this technique can be subject to occlusion (e.g. faces with masks or sunglasses) and out-of-distribution data. DeepFace-EMD (Phan et al. 2022) reaches state-of-the-art accuracy on out-of-distribution data by first comparing two images at the image level, and then at the patc…

    Submitted 5 November, 2023; originally announced November 2023.

    Comments: 20 pages, 15 Figures

  38. arXiv:2310.08864  [pdf, other]

    cs.RO

    Open X-Embodiment: Robotic Learning Datasets and RT-X Models

    Authors: Open X-Embodiment Collaboration, Abby O'Neill, Abdul Rehman, Abhinav Gupta, Abhiram Maddukuri, Abhishek Gupta, Abhishek Padalkar, Abraham Lee, Acorn Pooley, Agrim Gupta, Ajay Mandlekar, Ajinkya Jain, Albert Tung, Alex Bewley, Alex Herzog, Alex Irpan, Alexander Khazatsky, Anant Rai, Anchit Gupta, Andrew Wang, Andrey Kolobov, Anikait Singh, Animesh Garg, Aniruddha Kembhavi, Annie Xie, et al. (269 additional authors not shown)

    Abstract: Large, high-capacity models trained on diverse datasets have shown remarkable successes on efficiently tackling downstream applications. In domains from NLP to Computer Vision, this has led to a consolidation of pretrained models, with general pretrained backbones serving as a starting point for many applications. Can such a consolidation happen in robotics? Conventionally, robotic learning method…

    Submitted 14 May, 2025; v1 submitted 13 October, 2023; originally announced October 2023.

    Comments: Project website: https://robotics-transformer-x.github.io

  39. arXiv:2310.01720  [pdf, other]

    cs.LG cs.AI

    Perceiver-based CDF Modeling for Time Series Forecasting

    Authors: Cat P. Le, Chris Cannella, Ali Hasan, Yuting Ng, Vahid Tarokh

    Abstract: Transformers have demonstrated remarkable efficacy in forecasting time series data. However, their extensive dependence on self-attention mechanisms demands significant computational resources, thereby limiting their practical applicability across diverse tasks, especially in multimodal problems. In this work, we propose a new architecture, called perceiver-CDF, for modeling cumulative distributio…

    Submitted 24 June, 2024; v1 submitted 2 October, 2023; originally announced October 2023.

    Comments: Accepted in Winter Simulation Conference 2024

  40. arXiv:2308.06539  [pdf, other]

    cs.IT eess.SP

    Phase Shift Design for RIS-Aided Cell-Free Massive MIMO with Improved Differential Evolution

    Authors: Trinh Van Chien, Cuong V. Le, Huynh Thi Thanh Binh, Hien Quoc Ngo, Symeon Chatzinotas

    Abstract: This paper proposes a novel phase shift design for cell-free massive multiple-input and multiple-output (MIMO) systems assisted by reconfigurable intelligent surface (RIS), which only utilizes channel statistics to achieve the uplink sum ergodic throughput maximization under spatial channel correlations. Due to the non-convexity and the scale of the derived optimization problem, we develop an impr…

    Submitted 12 August, 2023; originally announced August 2023.

    Comments: 5 pages, 2 figures. Accepted by IEEE WCL

  41. arXiv:2307.16834  [pdf]

    cs.CV cs.AI cs.LG eess.IV

    Benchmarking Jetson Edge Devices with an End-to-end Video-based Anomaly Detection System

    Authors: Hoang Viet Pham, Thinh Gia Tran, Chuong Dinh Le, An Dinh Le, Hien Bich Vo

    Abstract: Innovative enhancement in embedded system platforms, specifically hardware accelerations, significantly influence the application of deep learning in real-world scenarios. These innovations translate human labor efforts into automated intelligent systems employed in various areas such as autonomous driving, robotics, Internet-of-Things (IoT), and numerous other impactful applications. NVIDIA's Jet…

    Submitted 12 September, 2023; v1 submitted 28 July, 2023; originally announced July 2023.

    Comments: Accepted in Future of Information and Communication Conference (FICC) 2024

  42. arXiv:2306.16678  [pdf, other]

    cs.CV cs.LG

    BinaryViT: Pushing Binary Vision Transformers Towards Convolutional Models

    Authors: Phuoc-Hoan Charles Le, Xinlin Li

    Abstract: With the increasing popularity and the increasing size of vision transformers (ViTs), there has been an increasing interest in making them more efficient and less computationally costly for deployment on edge devices with limited computing resources. Binarization can be used to help reduce the size of ViT models and their computational cost significantly, using popcount operations when the weights…

    Submitted 29 June, 2023; originally announced June 2023.

    Comments: Accepted in CVPR 2023 Workshop on Efficient Deep Learning for Computer Vision (ECV)

  43. arXiv:2305.15613  [pdf, other]

    cs.LG

    O$n$ Learning Deep O($n$)-Equivariant Hyperspheres

    Authors: Pavlo Melnyk, Michael Felsberg, Mårten Wadenbäck, Andreas Robinson, Cuong Le

    Abstract: In this paper, we utilize hyperspheres and regular $n$-simplexes and propose an approach to learning deep features equivariant under the transformations of $n$D reflections and rotations, encompassed by the powerful group of O$(n)$. Namely, we propose O$(n)$-equivariant neurons with spherical decision surfaces that generalize to any dimension $n$, which we call Deep Equivariant Hyperspheres. We de…

    Submitted 27 May, 2024; v1 submitted 24 May, 2023; originally announced May 2023.

    Comments: Proceedings of the 41st International Conference on Machine Learning, Vienna, Austria. PMLR 235, 2024

  44. arXiv:2305.14838  [pdf, other]

    cs.CL cs.SD eess.AS

    ComSL: A Composite Speech-Language Model for End-to-End Speech-to-Text Translation

    Authors: Chenyang Le, Yao Qian, Long Zhou, Shujie Liu, Yanmin Qian, Michael Zeng, Xuedong Huang

    Abstract: Joint speech-language training is challenging due to the large demand for training data and GPU consumption, as well as the modality gap between speech and language. We present ComSL, a speech-language model built atop a composite architecture of public pretrained speech-only and language-only models and optimized data-efficiently for spoken language tasks. Particularly, we propose to incorporate…

    Submitted 14 October, 2023; v1 submitted 24 May, 2023; originally announced May 2023.

    Comments: NeurIPS 2023, Poster

  45. arXiv:2305.11400  [pdf, other]

    cs.LG stat.ML

    Mode-Aware Continual Learning for Conditional Generative Adversarial Networks

    Authors: Cat P. Le, Juncheng Dong, Ahmed Aloui, Vahid Tarokh

    Abstract: The main challenge in continual learning for generative models is to effectively learn new target modes with limited samples while preserving previously learned ones. To this end, we introduce a new continual learning approach for conditional generative adversarial networks by leveraging a mode-affinity score specifically designed for generative modeling. First, the generator produces samples of e…

    Submitted 23 September, 2023; v1 submitted 18 May, 2023; originally announced May 2023.

  46. arXiv:2305.09463  [pdf, other]

    cs.SD cs.AI eess.AS

    Low-complexity deep learning frameworks for acoustic scene classification using teacher-student scheme and multiple spectrograms

    Authors: Lam Pham, Dat Ngo, Cam Le, Anahid Jalali, Alexander Schindler

    Abstract: In this technical report, a low-complexity deep learning system for acoustic scene classification (ASC) is presented. The proposed system comprises two main phases: (Phase I) Training a teacher network; and (Phase II) training a student network using distilled knowledge from the teacher. In the first phase, the teacher, which presents a large footprint model, is trained. After training the teacher…

    Submitted 16 May, 2023; originally announced May 2023.

    Comments: arXiv admin note: text overlap with arXiv:2206.06057

  47. arXiv:2305.01476  [pdf, other]

    cs.SD cs.MM eess.AS

    Deep Learning Based Multimodal with Two-phase Training Strategy for Daily Life Video Classification

    Authors: Lam Pham, Trang Le, Cam Le, Dat Ngo, Weissenfeld Axel, Alexander Schindler

    Abstract: In this paper, we present a deep learning based multimodal system for classifying daily life videos. To train the system, we propose a two-phase training strategy. In the first training phase (Phase I), we extract the audio and visual (image) data from the original video. We then train the audio data and the visual data with independent deep learning based models. After the training processes, we…

    Submitted 30 April, 2023; originally announced May 2023.

  48. arXiv:2302.13028  [pdf, other]

    cs.CV cs.AI cs.LG

    A Light-weight Deep Learning Model for Remote Sensing Image Classification

    Authors: Lam Pham, Cam Le, Dat Ngo, Anh Nguyen, Jasmin Lampert, Alexander Schindler, Ian McLoughlin

    Abstract: In this paper, we present a high-performance and light-weight deep learning model for Remote Sensing Image Classification (RSIC), the task of identifying the aerial scene of a remote sensing image. To this end, we first evaluate various benchmark convolutional neural network (CNN) architectures: MobileNet V1/V2, ResNet 50/151V2, InceptionV3/InceptionResNetV2, EfficientNet B0/B7, DenseNet 121/201, C…

    Submitted 25 February, 2023; originally announced February 2023.

  49. arXiv:2301.13372  [pdf, other]

    cs.CL cs.AI

    Improving Open-Domain Dialogue Evaluation with a Causal Inference Model

    Authors: Cat P. Le, Luke Dai, Michael Johnston, Yang Liu, Marilyn Walker, Reza Ghanadan

    Abstract: Effective evaluation methods remain a significant challenge for research on open-domain conversational dialogue systems. Explicit satisfaction ratings can be elicited from users, but users often do not provide ratings when asked, and those they give can be highly subjective. Post-hoc ratings by experts are an alternative, but these can be both expensive and complex to collect. Here, we explore the…

    Submitted 30 January, 2023; originally announced January 2023.

    Comments: Accepted as a conference paper at IWSDS 2023

  50. arXiv:2301.08530  [pdf, other]

    physics.data-an cs.AI nlin.AO

    Self-Organization Towards $1/f$ Noise in Deep Neural Networks

    Authors: Nicholas Chong Jia Le, Ling Feng

    Abstract: The presence of $1/f$ noise, also known as pink noise, is a well-established phenomenon in biological neural networks, and is thought to play an important role in information processing in the brain. In this study, we find that such $1/f$ noise is also found in deep neural networks trained on natural language, resembling that of their biological counterparts. Specifically, we trained Long Short-Te…

    Submitted 1 April, 2024; v1 submitted 20 January, 2023; originally announced January 2023.