-
A Unified Framework for Simulating Strongly-Coupled Fluid-Robot Multiphysics
Authors:
Jeong Hun Lee,
Junzhe Hu,
Sofia Kwok,
Carmel Majidi,
Zachary Manchester
Abstract:
We present a framework for simulating fluid-robot multiphysics as a single, unified optimization problem. The coupled manipulator and incompressible Navier-Stokes equations governing the robot and fluid dynamics are derived together from a single Lagrangian using the principle of least action. We then employ discrete variational mechanics to derive a stable, implicit time-integration scheme for jointly simulating both the fluid and robot dynamics, which are tightly coupled by a constraint that enforces the no-slip boundary condition at the fluid-robot interface. Extending the classical immersed boundary method, we derive a new formulation of the no-slip constraint that is numerically well-conditioned and physically accurate for multibody systems commonly found in robotics. We demonstrate our approach's physical accuracy on benchmark computational fluid-dynamics problems, including Poiseuille flow and a disc in free stream. We then design a locomotion policy for a novel swimming robot in simulation and validate results on real-world hardware, showcasing our framework's sim-to-real capability for robotics tasks.
Submitted 5 June, 2025;
originally announced June 2025.
-
FinSage: A Multi-aspect RAG System for Financial Filings Question Answering
Authors:
Xinyu Wang,
Jijun Chi,
Zhenghan Tai,
Tung Sum Thomas Kwok,
Muzhi Li,
Zhuhong Li,
Hailin He,
Yuchen Hua,
Peng Lu,
Suyuchen Wang,
Yihong Wu,
Jerry Huang,
Jingrui Tian,
Fengran Mo,
Yufei Cui,
Ling Zhou
Abstract:
Leveraging large language models in real-world settings often requires domain-specific data and tools in order to comply with the complex regulations governing acceptable use. Within financial sectors, modern enterprises increasingly rely on Retrieval-Augmented Generation (RAG) systems to address complex compliance requirements in financial document workflows. However, existing solutions struggle to account for the inherent heterogeneity of data (e.g., text, tables, diagrams) and the evolving nature of regulatory standards used in financial filings, leading to compromised accuracy in critical information extraction. We propose FinSage, a multi-aspect RAG framework tailored for regulatory compliance analysis in multi-modal financial documents. FinSage introduces three components: (1) a multi-modal pre-processing pipeline that unifies diverse data formats and generates chunk-level metadata summaries, (2) a multi-path sparse-dense retrieval system augmented with query expansion (HyDE) and metadata-aware semantic search, and (3) a domain-specialized re-ranking module fine-tuned via Direct Preference Optimization (DPO) to prioritize compliance-critical content. Extensive experiments demonstrate that FinSage achieves a recall of 92.51% on 75 expert-curated questions and surpasses the best baseline method on the FinanceBench question answering dataset by 24.06% in accuracy. Moreover, FinSage has been successfully deployed as a financial question-answering agent in online meetings, where it has already served more than 1,200 people.
Submitted 6 June, 2025; v1 submitted 20 April, 2025;
originally announced April 2025.
-
GReaTER: Generate Realistic Tabular data after data Enhancement and Reduction
Authors:
Tung Sum Thomas Kwok,
Chi-Hua Wang,
Guang Cheng
Abstract:
Tabular data synthesis involves not only multi-table synthesis but also generating multi-modal data (e.g., strings and categories), which enables diverse knowledge synthesis. However, separating numerical and categorical data has limited the effectiveness of tabular data generation. The GReaT (Generate Realistic Tabular Data) framework uses Large Language Models (LLMs) to encode entire rows, eliminating the need to partition data types. Despite this, the framework's performance is constrained by two issues: (1) tabular data entries lack sufficient semantic meaning, limiting the LLM's ability to leverage pre-trained knowledge for in-context learning, and (2) in complex multi-table datasets, it is difficult to establish effective relationships between tables. To address these, we propose GReaTER (Generate Realistic Tabular Data after data Enhancement and Reduction), which includes: (1) a data semantic enhancement system that improves the LLM's understanding of tabular data through mapping, enabling better in-context learning, and (2) a cross-table connecting method to establish efficient relationships across complex tables. Experimental results show that GReaTER outperforms the GReaT framework.
Submitted 19 March, 2025;
originally announced March 2025.
-
A Faster Algorithm for Maximum Weight Matching on Unrestricted Bipartite Graphs
Authors:
Shawxing Kwok
Abstract:
Given a weighted bipartite graph $G = (L, R, E, w)$, the maximum weight matching (MWM) problem seeks to find a matching $M \subseteq E$ that maximizes the total weight $\sum_{e \in M} w(e)$.
This paper presents a novel algorithm with a time complexity of $O(\min(X^3 + E, XE + X^2\log X))$, where $X = \min(|L|, |R|)$. Unlike many existing algorithms, our approach supports real-valued weights without additional constraints. Under this condition, our result improves upon the previous best-known bound of $O(VE + V^2\log V)$, or more strictly $O(XE + XV\log V)$, where $V = L \cup R$.
The suggested implementation code is simplified and publicly available at https://github.com/ShawxingKwok/Kwok-algorithm, with the average-case time complexity of $O(E^{1.4} + LR)$ estimated from experimental results on random graphs.
Submitted 4 April, 2025; v1 submitted 28 February, 2025;
originally announced February 2025.
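To make the MWM objective above concrete, here is a minimal brute-force reference in Python. It is not the algorithm from the paper and runs in exponential time; it only illustrates the problem definition (real-valued weights, vertices may stay unmatched) and could serve as a correctness check on tiny graphs. The function name and the edge-dictionary input format are assumptions made for this sketch.

```python
from functools import lru_cache


def max_weight_matching_bruteforce(n_left, n_right, edges):
    """Exhaustive maximum weight matching on a tiny bipartite graph.

    edges maps (l, r) -> real-valued weight, with 0 <= l < n_left and
    0 <= r < n_right. Returns (best_weight, matching), where matching is
    a tuple of (l, r) pairs. Hypothetical helper, exponential in n_right.
    """
    adj = [[] for _ in range(n_left)]
    for (l, r), w in edges.items():
        adj[l].append((r, w))

    @lru_cache(maxsize=None)
    def best(l, used_mask):
        # used_mask has bit r set when right vertex r is already matched.
        if l == n_left:
            return 0.0, ()
        # Option 1: leave left vertex l unmatched.
        best_w, best_m = best(l + 1, used_mask)
        # Option 2: match l to any still-free right neighbour.
        for r, w in adj[l]:
            if not used_mask & (1 << r):
                sub_w, sub_m = best(l + 1, used_mask | (1 << r))
                if w + sub_w > best_w:
                    best_w, best_m = w + sub_w, ((l, r),) + sub_m
        return best_w, best_m

    return best(0, 0)


if __name__ == "__main__":
    toy_edges = {(0, 0): 2.5, (0, 1): 4.0, (1, 1): 3.0, (1, 0): -1.0}
    print(max_weight_matching_bruteforce(2, 2, toy_edges))
```

On the toy graph, taking the single heaviest edge (0, 1) is suboptimal; pairing (0, 0) with (1, 1) gives the larger total, which is the kind of trade-off any MWM algorithm has to resolve.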
-
DEREC-SIMPRO: unlock Language Model benefits to advance Synthesis in Data Clean Room
Authors:
Tung Sum Thomas Kwok,
Chi-hua Wang,
Guang Cheng
Abstract:
Data collaboration via Data Clean Room offers value but raises privacy concerns, which can be addressed through synthetic data and multi-table synthesizers. Common multi-table synthesizers fail to perform when subjects occur repeatedly in both tables. This is an urgent yet unresolved problem, since having both tables with repeating subjects is common. To improve performance in this scenario, we present the DEREC 3-step pre-processing pipeline to generalize adaptability of multi-table synthesizers. We also introduce the SIMPRO 3-aspect evaluation metrics, which leverage conditional distribution and large-scale simultaneous hypothesis testing to provide comprehensive feedback on synthetic data fidelity at both column and table levels. Results show that using DEREC improves fidelity, and multi-table synthesizers outperform single-table counterparts in collaboration settings. Together, the DEREC-SIMPRO pipeline offers a robust solution for generalizing data collaboration, promoting a more efficient, data-driven society.
Submitted 31 October, 2024;
originally announced November 2024.
-
Real-Time Whole-Body Control of Legged Robots with Model-Predictive Path Integral Control
Authors:
Juan Alvarez-Padilla,
John Z. Zhang,
Sofia Kwok,
John M. Dolan,
Zachary Manchester
Abstract:
This paper presents a system for enabling real-time synthesis of whole-body locomotion and manipulation policies for real-world legged robots. Motivated by recent advancements in robot simulation, we leverage the efficient parallelization capabilities of the MuJoCo simulator to achieve fast sampling over the robot state and action trajectories. Our results show surprisingly effective real-world locomotion and manipulation capabilities with a very simple control strategy. We demonstrate our approach on several hardware and simulation experiments: robust locomotion over flat and uneven terrains, climbing over a box whose height is comparable to the robot, and pushing a box to a goal position. To our knowledge, this is the first successful deployment of whole-body sampling-based MPC on real-world legged robot hardware. Experiment videos and code can be found at: https://whole-body-mppi.github.io/
Submitted 16 September, 2024;
originally announced September 2024.
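The sampling-based update at the core of model-predictive path integral (MPPI) control can be sketched in a few lines of Python. This is a generic textbook-style illustration, not the authors' implementation: rollout costs are passed in directly rather than computed with MuJoCo, controls are scalar per time step, and the function names and the temperature parameter `lam` are assumptions made for this sketch.

```python
import math


def mppi_weights(costs, lam):
    """Softmin weights over rollout costs; lam is the temperature."""
    c_min = min(costs)  # subtract the minimum for numerical stability
    unnorm = [math.exp(-(c - c_min) / lam) for c in costs]
    total = sum(unnorm)
    return [u / total for u in unnorm]


def mppi_update(nominal, perturbations, costs, lam):
    """Shift the nominal control sequence by the cost-weighted perturbations.

    nominal:        list of scalar controls, one per time step
    perturbations:  one sampled noise sequence per rollout
    costs:          total cost of each perturbed rollout
    """
    w = mppi_weights(costs, lam)
    return [
        nominal[t] + sum(w[k] * perturbations[k][t] for k in range(len(w)))
        for t in range(len(nominal))
    ]


if __name__ == "__main__":
    # Two rollouts: the first is much cheaper, so it dominates the update.
    print(mppi_update([0.0, 0.0], [[1.0, 0.5], [-1.0, -0.5]], [0.1, 5.0], 1.0))
```

In a full controller, the costs would come from simulating each perturbed control sequence in parallel, and the updated sequence would be applied in receding-horizon fashion.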
-
IntentionQA: A Benchmark for Evaluating Purchase Intention Comprehension Abilities of Language Models in E-commerce
Authors:
Wenxuan Ding,
Weiqi Wang,
Sze Heng Douglas Kwok,
Minghao Liu,
Tianqing Fang,
Jiaxin Bai,
Xin Liu,
Changlong Yu,
Zheng Li,
Chen Luo,
Qingyu Yin,
Bing Yin,
Junxian He,
Yangqiu Song
Abstract:
Enhancing Language Models' (LMs) ability to understand purchase intentions in E-commerce scenarios is crucial for their effective assistance in various downstream tasks. However, previous approaches that distill intentions from LMs often fail to generate meaningful and human-centric intentions applicable in real-world E-commerce contexts. This raises concerns about the true comprehension and utilization of purchase intentions by LMs. In this paper, we present IntentionQA, a double-task multiple-choice question answering benchmark to evaluate LMs' comprehension of purchase intentions in E-commerce. Specifically, LMs are tasked with inferring intentions based on purchased products and utilizing them to predict additional purchases. IntentionQA consists of 4,360 carefully curated problems across three difficulty levels, constructed using an automated pipeline to ensure scalability on large E-commerce platforms. Human evaluations demonstrate the high quality and low false-negative rate of our benchmark. Extensive experiments across 19 language models show that they still struggle with certain scenarios, such as understanding products and intentions accurately and jointly reasoning with products and intentions, in which they fall far behind human performance. Our code and data are publicly available at https://github.com/HKUST-KnowComp/IntentionQA.
Submitted 29 September, 2024; v1 submitted 14 June, 2024;
originally announced June 2024.
-
Social Contract AI: Aligning AI Assistants with Implicit Group Norms
Authors:
Jan-Philipp Fränken,
Sam Kwok,
Peixuan Ye,
Kanishk Gandhi,
Dilip Arumugam,
Jared Moore,
Alex Tamkin,
Tobias Gerstenberg,
Noah D. Goodman
Abstract:
We explore the idea of aligning an AI assistant by inverting a model of users' (unknown) preferences from observed interactions. To validate our proposal, we run proof-of-concept simulations in the economic ultimatum game, formalizing user preferences as policies that guide the actions of simulated players. We find that the AI assistant accurately aligns its behavior to match standard policies from the economic literature (e.g., selfish, altruistic). However, the assistant's learned policies lack robustness and exhibit limited generalization in an out-of-distribution setting when confronted with a currency (e.g., grams of medicine) that was not included in the assistant's training distribution. Additionally, we find that when there is inconsistency in the relationship between language use and an unknown policy (e.g., an altruistic policy combined with rude language), the assistant's learning of the policy is slowed. Overall, our preliminary results suggest that developing simulation frameworks in which AI assistants need to infer preferences from diverse users can provide a valuable approach for studying practical alignment questions.
Submitted 3 December, 2023; v1 submitted 26 October, 2023;
originally announced October 2023.
-
Investigation of wind pressures on tall building under interference effects using machine learning techniques
Authors:
Gang Hu,
Lingbo Liu,
Dacheng Tao,
Jie Song,
K. C. S. Kwok
Abstract:
Interference effects of tall buildings have attracted numerous studies due to the boom of clusters of tall buildings in megacities. Fully understanding the interference effects of buildings often requires a substantial number of wind tunnel tests. Limited wind tunnel tests that cover only part of the interference scenarios are unable to fully reveal the interference effects. This study used machine learning techniques to resolve the conflict between limited wind tunnel tests that produce unreliable results and a complete investigation of the interference effects that is costly and time-consuming. Four machine learning models, namely decision tree, random forest, XGBoost, and generative adversarial networks (GANs), were trained on 30% of a dataset to predict both mean and fluctuating pressure coefficients on the principal building. The GANs model exhibited the best performance in predicting these pressure coefficients. A number of GANs models were then trained on different portions of the dataset ranging from 10% to 90%. It was found that the GANs model trained on 30% of the dataset is capable of accurately predicting both mean and fluctuating pressure coefficients under unseen interference conditions. By using this GANs model, 70% of the wind tunnel test cases can be saved, largely alleviating the cost of this kind of wind tunnel testing study.
Submitted 20 August, 2019;
originally announced August 2019.
-
Predicting wind pressures around circular cylinders using machine learning techniques
Authors:
Gang Hu,
K. C. S. Kwok
Abstract:
Numerous studies have been carried out to measure wind pressures around circular cylinders since the early 20th century due to their engineering significance. Consequently, a large number of wind pressure data sets have accumulated, which presents an excellent opportunity for using machine learning (ML) techniques to train models to predict wind pressures around circular cylinders. Wind pressures around smooth circular cylinders are mainly a function of the Reynolds number (Re), the turbulence intensity (Ti) of the incident wind, and the circumferential angle of the cylinder. Taking these three parameters as the inputs, this study trained two ML models to predict mean and fluctuating pressures, respectively. Three machine learning algorithms, namely decision tree regressor, random forest, and gradient boosting regression trees (GBRT), were tested. The GBRT models exhibited the best performance for predicting both mean and fluctuating pressures, and they are capable of making accurate predictions for Re ranging from 10^4 to 10^6 and Ti ranging from 0% to 15%. It is believed that the GBRT models provide a very efficient and economical alternative to traditional wind tunnel tests and computational fluid dynamics simulations for determining wind pressures around smooth circular cylinders within the studied Re and Ti range.
Submitted 20 January, 2019;
originally announced January 2019.
-
BACH: Grand Challenge on Breast Cancer Histology Images
Authors:
Guilherme Aresta,
Teresa Araújo,
Scotty Kwok,
Sai Saketh Chennamsetty,
Mohammed Safwan,
Varghese Alex,
Bahram Marami,
Marcel Prastawa,
Monica Chan,
Michael Donovan,
Gerardo Fernandez,
Jack Zeineh,
Matthias Kohl,
Christoph Walz,
Florian Ludwig,
Stefan Braunewell,
Maximilian Baust,
Quoc Dang Vu,
Minh Nguyen Nhat To,
Eal Kim,
Jin Tae Kwak,
Sameh Galal,
Veronica Sanchez-Freire,
Nadia Brancati,
Maria Frucci
, et al. (11 additional authors not shown)
Abstract:
Breast cancer is the most common invasive cancer in women, affecting more than 10% of women worldwide. Microscopic analysis of a biopsy remains one of the most important methods to diagnose the type of breast cancer. This requires specialized analysis by pathologists, in a task that i) is highly time- and cost-consuming and ii) often leads to nonconsensual results. The relevance and potential of automatic classification algorithms using hematoxylin-eosin stained histopathological images have already been demonstrated, but the reported results are still sub-optimal for clinical use. With the goal of advancing the state-of-the-art in automatic classification, the Grand Challenge on BreAst Cancer Histology images (BACH) was organized in conjunction with the 15th International Conference on Image Analysis and Recognition (ICIAR 2018). A large annotated dataset, composed of both microscopy and whole-slide images, was specifically compiled and made publicly available for the BACH challenge. Following a positive response from the scientific community, a total of 64 submissions, out of 677 registrations, effectively entered the competition. The submitted algorithms pushed forward the state-of-the-art in terms of accuracy (87%) in automatic classification of breast cancer with histopathological images. Convolutional neural networks were the most successful methodology in the BACH challenge. Detailed analysis of the collective results allowed the identification of remaining challenges in the field and recommendations for future developments. The BACH dataset remains publicly available so as to promote further improvements to the field of automatic classification in digital pathology.
Submitted 17 June, 2019; v1 submitted 13 August, 2018;
originally announced August 2018.
-
Multiple decision trees
Authors:
Suk Wah Kwok,
Chris Carter
Abstract:
This paper describes experiments, on two domains, to investigate the effect of averaging over predictions of multiple decision trees, instead of using a single tree. Other authors have pointed out theoretical and commonsense reasons for preferring the multiple tree approach. Ideally, we would like to consider predictions from all trees, weighted by their probability. However, there is a vast number of different trees, and it is difficult to estimate the probability of each tree. We sidestep the estimation problem by using a modified version of the ID3 algorithm to build good trees, and average over only these trees. Our results are encouraging. For each domain, we managed to produce a small number of good trees. We find that it is best to average across sets of trees with different structure; this usually gives better performance than any of the constituent trees, including the ID3 tree.
Submitted 27 March, 2013;
originally announced April 2013.
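The idea of averaging class predictions over several trees can be illustrated with a short Python sketch. It is only a toy: the "trees" are hypothetical hand-written stumps rather than trees built by a modified ID3, and uniform weights are used by default where the abstract speaks of weighting by tree probability. All names here are assumptions made for the example.

```python
def average_predictions(trees, x, weights=None):
    """Average class-probability vectors predicted by several trees.

    trees:   callables mapping an example x to a list of class probabilities
    weights: optional per-tree weights summing to 1 (uniform if omitted)
    """
    if weights is None:
        weights = [1.0 / len(trees)] * len(trees)
    n_classes = len(trees[0](x))
    avg = [0.0] * n_classes
    for tree, w in zip(trees, weights):
        for k, p in enumerate(tree(x)):
            avg[k] += w * p
    return avg


# Two hypothetical stumps with different structure (they split on different features).
def stump_on_a(x):
    return [1.0, 0.0] if x["a"] < 1 else [0.0, 1.0]


def stump_on_b(x):
    return [0.5, 0.5] if x["b"] < 1 else [0.25, 0.75]


if __name__ == "__main__":
    print(average_predictions([stump_on_a, stump_on_b], {"a": 0, "b": 0}))
```

Because the two stumps disagree in how confident they are, the averaged vector is less extreme than the first stump alone, which is the smoothing effect that averaging over structurally different trees relies on.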
-
A Fast Image Encryption Scheme based on Chaotic Standard Map
Authors:
Kwok-Wo Wong,
Bernie Sin-Hung Kwok,
Wing-Shing Law
Abstract:
In recent years, a variety of effective chaos-based image encryption schemes have been proposed. In the typical structure of these schemes, the permutation and diffusion stages are performed alternately. The confusion and diffusion effects are contributed solely by the permutation and the diffusion stage, respectively. As a result, more overall rounds than necessary are required to achieve a certain level of security. In this paper, we suggest introducing a certain diffusion effect in the confusion stage by simple sequential add-and-shift operations. The purpose is to reduce the workload of the time-consuming diffusion part so that fewer overall rounds, and hence a shorter encryption time, are needed. Simulation results show that, at a similar performance level, the proposed cryptosystem needs less than one-third of the encryption time of an existing cryptosystem. An effective acceleration of the encryption speed is thus achieved.
Submitted 29 September, 2006;
originally announced September 2006.
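The permutation (confusion) stage of such schemes can be sketched in Python with one commonly used discretization of the standard map on an N x N image. This is an illustrative assumption, not the exact map, key schedule, or add-and-shift diffusion from the paper; the function names and the integer parameter `k` are hypothetical.

```python
import math


def _shift(xn, n, k):
    # Integer offset derived from the sine term of the discretized standard map.
    return int(round(k * math.sin(2 * math.pi * xn / n)))


def standard_map_permute(img, k):
    """Move pixel (x, y) to (x', y') with x' = (x + y) mod N and
    y' = (y + shift(x')) mod N. The mapping is a bijection on the grid."""
    n = len(img)
    out = [[None] * n for _ in range(n)]
    for x in range(n):
        for y in range(n):
            xn = (x + y) % n
            yn = (y + _shift(xn, n, k)) % n
            out[xn][yn] = img[x][y]
    return out


def standard_map_unpermute(img, k):
    """Invert standard_map_permute by solving for (x, y) from (x', y')."""
    n = len(img)
    out = [[None] * n for _ in range(n)]
    for xn in range(n):
        for yn in range(n):
            y = (yn - _shift(xn, n, k)) % n
            x = (xn - y) % n
            out[x][y] = img[xn][yn]
    return out


if __name__ == "__main__":
    image = [[r * 4 + c for c in range(4)] for r in range(4)]
    scrambled = standard_map_permute(image, 3)
    print(standard_map_unpermute(scrambled, 3) == image)
```

A pure permutation like this only relocates pixels and leaves their values unchanged, which is why schemes of this kind pair it with a diffusion step that modifies pixel values.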