-
Whence Is A Model Fair? Fixing Fairness Bugs via Propensity Score Matching
Authors:
Kewen Peng,
Yicheng Yang,
Hao Zhuo
Abstract:
Fairness-aware learning aims to mitigate discrimination against specific protected social groups (e.g., those categorized by gender, ethnicity, age) while minimizing predictive performance loss. Despite efforts to improve fairness in machine learning, prior studies have shown that many models remain unfair when measured against various fairness metrics. In this paper, we examine whether the way training and testing data are sampled affects the reliability of reported fairness metrics. Since training and test sets are often randomly sampled from the same population, bias present in the training data may still exist in the test data, potentially skewing fairness assessments. To address this, we propose FairMatch, a post-processing method that applies propensity score matching to evaluate and mitigate bias. FairMatch identifies control and treatment pairs with similar propensity scores in the test set and adjusts decision thresholds for different subgroups accordingly. For samples that cannot be matched, we perform probabilistic calibration using fairness-aware loss functions. Experimental results demonstrate that our approach can (a) precisely locate subsets of the test data where the model is unbiased, and (b) significantly reduce bias on the remaining data. Overall, propensity score matching offers a principled way to improve both fairness evaluation and mitigation, without sacrificing predictive performance.
Submitted 1 May, 2025; v1 submitted 23 April, 2025;
originally announced April 2025.
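
The matching step described in the abstract above can be illustrated with a minimal sketch. This is not the authors' FairMatch implementation: the function name `match_by_propensity`, the greedy 1:1 strategy, and the `caliper` tolerance are assumptions chosen for illustration, and the propensity scores are assumed to have been estimated already.

```python
def match_by_propensity(treated, control, caliper=0.05):
    """Greedy 1:1 nearest-neighbour matching on propensity scores.

    treated, control: lists of (sample_id, propensity_score) tuples.
    caliper: hypothetical maximum score difference allowed for a match.
    Returns (pairs, unmatched_treated_ids).
    """
    available = dict(control)  # control id -> score; shrinks as controls are used
    pairs, unmatched = [], []
    for t_id, t_score in treated:
        # Closest remaining control sample by absolute score difference.
        best = min(available.items(),
                   key=lambda kv: abs(kv[1] - t_score),
                   default=None)
        if best is not None and abs(best[1] - t_score) <= caliper:
            pairs.append((t_id, best[0]))
            del available[best[0]]
        else:
            unmatched.append(t_id)
    return pairs, unmatched


treated = [("t1", 0.30), ("t2", 0.90)]
control = [("c1", 0.32), ("c2", 0.50)]
pairs, unmatched = match_by_propensity(treated, control)
```

In this toy input, the unmatched samples are the ones the abstract says are handled separately through probabilistic calibration; that step is not sketched here.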
-
Combating Toxic Language: A Review of LLM-Based Strategies for Software Engineering
Authors:
Hao Zhuo,
Yicheng Yang,
Kewen Peng
Abstract:
Large Language Models (LLMs) have become integral to software engineering (SE), where they are increasingly used in development workflows. However, their widespread use raises concerns about the presence and propagation of toxic language--harmful or offensive content that can foster exclusionary environments. This paper provides a comprehensive review of recent research on toxicity detection and mitigation, focusing on both SE-specific and general-purpose datasets. We examine annotation and preprocessing techniques, assess detection methodologies, and evaluate mitigation strategies, particularly those leveraging LLMs. Additionally, we conduct an ablation study demonstrating the effectiveness of LLM-based rewriting for reducing toxicity. By synthesizing existing work and identifying open challenges, this review highlights key areas for future research to ensure the responsible deployment of LLMs in SE and beyond.
Submitted 21 April, 2025;
originally announced April 2025.
-
Software Engineering Principles for Fairer Systems: Experiments with GroupCART
Authors:
Kewen Peng,
Hao Zhuo,
Yicheng Yang,
Tim Menzies
Abstract:
Discrimination-aware classification aims to make accurate predictions while satisfying fairness constraints. Traditional decision tree learners typically optimize for information gain in the target attribute alone, which can result in models that unfairly discriminate against protected social groups (e.g., gender, ethnicity). Motivated by these shortcomings, we propose GroupCART, a tree-based ensemble optimizer that avoids bias during model construction by optimizing not only for decreased entropy in the target attribute but also for increased entropy in protected attributes. Our experiments show that GroupCART achieves fairer models without data transformation and with minimal performance degradation. Furthermore, the method supports customizable weighting, offering a smooth and flexible trade-off between predictive performance and fairness based on user requirements. These results demonstrate that algorithmic bias in decision tree models can be mitigated through multi-task, fairness-aware learning. All code and datasets used in this study are available at: https://github.com/anonymous12138/groupCART.
Submitted 16 April, 2025;
originally announced April 2025.
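
The idea of scoring a split by both target and protected-attribute entropy, as the abstract above describes, might look like the following sketch. It is an assumption-laden illustration, not the GroupCART code: the linear combination in `split_score` and the `weight` parameter are hypothetical stand-ins for the customizable weighting the abstract mentions.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of discrete labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def info_gain(parent, left, right):
    """Entropy reduction achieved by splitting parent into left and right."""
    n = len(parent)
    children = (len(left) / n) * entropy(left) + (len(right) / n) * entropy(right)
    return entropy(parent) - children


def split_score(target, protected, mask, weight=0.5):
    """Score a candidate split; mask[i] is True when sample i goes left.

    Rewards information gain on the target and penalises information gain
    on the protected attribute (i.e. prefers splits that keep protected
    groups mixed). The weighting scheme is an assumption for illustration.
    """
    def parts(values):
        left = [v for v, m in zip(values, mask) if m]
        right = [v for v, m in zip(values, mask) if not m]
        return left, right

    gain_target = info_gain(target, *parts(target))
    gain_protected = info_gain(protected, *parts(protected))
    return (1 - weight) * gain_target - weight * gain_protected


target = [1, 1, 0, 0]
protected = ["a", "b", "a", "b"]
fair_split = split_score(target, protected, [True, True, False, False])
biased_split = split_score(target, protected, [True, False, True, False])
```

Here `fair_split` separates the target classes while leaving both protected groups in each branch, so it scores higher than `biased_split`, which separates only the protected groups.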
-
Efficiently Laser Driven Terahertz Surface Plasmon Polaritons on Long Metal Wire
Authors:
Shuoting Shao,
Xiangbing Wang,
Rong Huang,
Guangyue Hu,
Min Chen,
Huibo Tang,
Longyu Kuang,
Yuxi Liu,
Yuqiu Gu,
Yongkun Ding,
Ruxin Li,
Hongbin Zhuo,
Mingyang Yu
Abstract:
We experimentally demonstrate a novel scheme for efficiently generating intense terahertz (THz) surface plasmon polaritons (SPPs) on a sub-wavelength-diameter, meter-long metal wire. Driven by a subrelativistic femtosecond laser (a0=0.3, 3 mJ) focused at the wire's midpoint, single-cycle ten-megawatt THz SPPs are excited and propagate bidirectionally along it over 25 cm. The measured laser-to-SPP energy conversion efficiency reaches up to ~2.4%, which is the highest value at present. It is proved that the THz SPPs are excited by coherent transition radiation of the escaping electrons produced by the subrelativistic laser. Particle-in-cell and CST simulations confirm the experimental observations. Our scheme, which uses a readily available subrelativistic laser, should thus be useful for applications requiring terawatt level single-cycle THz SPPs.
Submitted 21 February, 2025; v1 submitted 11 February, 2025;
originally announced February 2025.
-
On the Roles of LLMs in Planning: Embedding LLMs into Planning Graphs
Authors:
Hankz Hankui Zhuo,
Xin Chen,
Rong Pan
Abstract:
Plan synthesis aims to generate a course of actions or policies to transit given initial states to goal states, provided domain models that could be designed by experts or learnt from training data or interactions with the world. Intrigued by the claims of emergent planning capabilities in large language models (LLMs), works have been proposed to investigate the planning effectiveness of LLMs, without considering any utilization of off-the-shelf planning techniques in LLMs. In this paper, we aim to further study the insight of the planning capability of LLMs by investigating the roles of LLMs in off-the-shelf planning frameworks. To do this, we investigate the effectiveness of embedding LLMs into one of the well-known planning frameworks, graph-based planning, proposing a novel LLMs-based planning framework with LLMs embedded in two levels of planning graphs, i.e., mutual constraints generation level and constraints solving level. We empirically exhibit the effectiveness of our proposed framework in various planning domains.
Submitted 26 July, 2024; v1 submitted 18 February, 2024;
originally announced March 2024.
-
BalMCTS: Balancing Objective Function and Search Nodes in MCTS for Constraint Optimization Problems
Authors:
Yingkai Xiao,
Jingjin Liu,
Hankz Hankui Zhuo
Abstract:
Constraint Optimization Problems (COPs) pose intricate challenges in combinatorial problems and are usually addressed through Branch and Bound (B&B) methods, which involve maintaining priority queues and iteratively selecting branches to search for solutions. However, conventional approaches take a considerable amount of time to find optimal solutions, and it is also crucial to quickly identify a near-optimal feasible solution in a shorter time. In this paper, we aim to investigate the effectiveness of employing a depth-first search algorithm for solving COPs, specifically focusing on identifying optimal or near-optimal solutions within the top $n$ solutions. Hence, we propose a novel heuristic neural network algorithm based on MCTS, which, by simultaneously conducting search and training, enables the neural network to effectively serve as a heuristic during backtracking. Furthermore, our approach incorporates encoding COPs and utilizing graph neural networks to aggregate information about variables and constraints, offering more appropriate variables for assignments. Experimental results on stochastic COP instances demonstrate that our method identifies feasible solutions with a gap of less than 17.63% within the initial 5 feasible solutions. Moreover, when applied to attendant Constraint Satisfaction Problem (CSP) instances, our method exhibits a remarkable reduction of less than 5% in searching nodes compared to state-of-the-art approaches.
Submitted 25 December, 2023;
originally announced December 2023.
-
Planning with Logical Graph-based Language Model for Instruction Generation
Authors:
Fan Zhang,
Kebing Jin,
Hankz Hankui Zhuo
Abstract:
Despite the superior performance of large language models to generate natural language texts, it is hard to generate texts with correct logic according to a given task, due to the difficulties for neural models to capture implied rules from free-form texts. In this paper, we propose a novel graph-based language model, Logical-GLM, to infuse logic into language models for more valid text generation and interpretability. Specifically, we first capture information from natural language instructions and construct logical bayes graphs that generally describe domains. Next, we generate logical skeletons to guide language model training, infusing domain knowledge into language models. Finally, we alternately optimize the searching policy of graphs and language models until convergence. The experimental results show that Logical-GLM is both effective and efficient compared with traditional language models, despite using smaller-scale training data and fewer parameters. Our approach can generate instructional texts with more correct logic owing to the internalized domain knowledge. Moreover, the usage of logical graphs reflects the inner mechanism of the language models, which improves the interpretability of black-box models.
Submitted 5 July, 2024; v1 submitted 26 August, 2023;
originally announced August 2023.
-
DPBERT: Efficient Inference for BERT based on Dynamic Planning
Authors:
Weixin Wu,
Hankz Hankui Zhuo
Abstract:
Large-scale pre-trained language models such as BERT have contributed significantly to the development of NLP. However, those models require large computational resources, making them difficult to apply to mobile devices where computing power is limited. In this paper we aim to address the weakness of existing input-adaptive inference methods, which fail to take full advantage of the structure of BERT. We propose Dynamic Planning in BERT, a novel fine-tuning strategy that can accelerate the inference process of BERT by selecting a subsequence of the backbone's transformer layers as a computational path for an input sample. To do this, our approach adds a planning module to the original BERT model to determine whether a layer is included or bypassed during inference. Experimental results on the GLUE benchmark show that our method reduces latency to 75% while maintaining 98% accuracy, yielding a better accuracy-speed trade-off compared to state-of-the-art input-adaptive methods.
Submitted 26 July, 2023;
originally announced August 2023.
-
Hierarchical Task Network Planning for Facilitating Cooperative Multi-Agent Reinforcement Learning
Authors:
Xuechen Mu,
Hankz Hankui Zhuo,
Chen Chen,
Kai Zhang,
Chao Yu,
Jianye Hao
Abstract:
Exploring sparse reward multi-agent reinforcement learning (MARL) environments with traps in a collaborative manner is a complex task. Agents typically fail to reach the goal state and fall into traps, which affects the overall performance of the system. To overcome this issue, we present SOMARL, a framework that uses prior knowledge to reduce the exploration space and assist learning. In SOMARL, agents are treated as part of the MARL environment, and symbolic knowledge is embedded using a tree structure to build a knowledge hierarchy. The framework has a two-layer hierarchical structure, comprising a hybrid module with a Hierarchical Task Network (HTN) planning module and a meta-controller at the higher level, and a MARL-based interactive module at the lower level. The HTN module and meta-controller use Hierarchical Domain Definition Language (HDDL) and the option framework to formalize symbolic knowledge and obtain domain knowledge and a symbolic option set, respectively. Moreover, the HTN module leverages domain knowledge to guide low-level agent exploration by assisting the meta-controller in selecting symbolic options. The meta-controller further computes intrinsic rewards of symbolic options to limit exploration behavior and adjust HTN planning solutions as needed. We evaluate SOMARL on two benchmarks, FindTreasure and MoveBox, and report significantly better performance than state-of-the-art MARL and subgoal-based baselines for MARL environments.
Submitted 14 June, 2023;
originally announced June 2023.
-
Sequential Condition Evolved Interaction Knowledge Graph for Traditional Chinese Medicine Recommendation
Authors:
Jingjin Liu,
Hankz Hankui Zhuo,
Kebing Jin,
Jiamin Yuan,
Zhimin Yang,
Zhengan Yao
Abstract:
Traditional Chinese Medicine (TCM) has a rich history of utilizing natural herbs to treat a diversity of illnesses. In practice, TCM diagnosis and treatment are highly personalized and organically holistic, requiring comprehensive consideration of the patient's state and symptoms over time. However, existing TCM recommendation approaches overlook the changes in patient status and only explore potential patterns between symptoms and prescriptions. In this paper, we propose a novel Sequential Condition Evolved Interaction Knowledge Graph (SCEIKG), a framework that treats the model as a sequential prescription-making problem by considering the dynamics of the patient's condition across multiple visits. In addition, we incorporate an interaction knowledge graph to enhance the accuracy of recommendations by considering the interactions between different herbs and the patient's condition. Experimental results on a real-world dataset demonstrate that our approach outperforms existing TCM recommendation methods, achieving state-of-the-art performance.
Submitted 6 October, 2023; v1 submitted 28 May, 2023;
originally announced May 2023.
-
XRoute Environment: A Novel Reinforcement Learning Environment for Routing
Authors:
Zhanwen Zhou,
Hankz Hankui Zhuo,
Xiaowu Zhang,
Qiyuan Deng
Abstract:
Routing is a crucial and time-consuming stage in modern design automation flow for advanced technology nodes. Great progress in the field of reinforcement learning makes it possible to use those approaches to improve the routing quality and efficiency. However, the scale of the routing problems solved by reinforcement learning-based methods in recent studies is too small for these methods to be used in commercial EDA tools. We introduce the XRoute Environment, a new reinforcement learning environment where agents are trained to select and route nets in an advanced, end-to-end routing framework. Novel algorithms and ideas can be quickly tested in a safe and reproducible manner in it. The resulting environment is challenging, easy to use, customize and add additional scenarios, and it is available under a permissive open-source license. In addition, it provides support for distributed deployment and multi-instance experiments. We propose two tasks for learning and build a full-chip test bed with routing benchmarks of various region sizes. We also pre-define several static routing regions with different pin density and number of nets for easier learning and testing. For net ordering task, we report baseline results for two widely used reinforcement learning algorithms (PPO and DQN) and one searching-based algorithm (TritonRoute). The XRoute Environment will be available at https://github.com/xplanlab/xroute_env.
Submitted 5 June, 2023; v1 submitted 23 May, 2023;
originally announced May 2023.
-
Branching of high-current relativistic electron beam in porous materials
Authors:
K. Jiang,
T. W. Huang,
R. Li,
M. Y. Yu,
H. B. Zhuo,
S. Z. Wu,
C. T. Zhou,
S. C. Ruan
Abstract:
Propagation of a high-current relativistic electron beam (REB) in plasma is relevant to many high-energy astrophysical phenomena as well as applications based on high-intensity lasers and charged-particle beams. Here we report a new regime of beam-plasma interaction arising from REB propagation in a medium with fine structures. In this regime, the REB cascades into thin branches with local density a hundred times the initial value and deposits its energy two orders of magnitude more efficiently than in homogeneous plasma of similar average density, where REB branching does not occur. Such beam branching can be attributed to successive weak scatterings of the beam electrons by the unevenly distributed magnetic fields induced by the local return currents in the skeletons of the porous medium. Results from a model for the excitation conditions and location of the first branching point with respect to the medium and beam parameters agree well with those from pore-resolved particle-in-cell simulations.
Submitted 5 May, 2023;
originally announced May 2023.
-
Reinforcement Learning with Knowledge Representation and Reasoning: A Brief Survey
Authors:
Chao Yu,
Shicheng Ye,
Hankz Hankui Zhuo
Abstract:
Reinforcement Learning (RL) has achieved tremendous development in recent years, but still faces significant obstacles in addressing complex real-life problems due to the issues of poor system generalization, low sample efficiency, as well as safety and interpretability concerns. The core reason underlying such dilemmas can be attributed to the fact that most of the work has focused on the computational aspect of value functions or policies using a representational model to describe atomic components of rewards, states, actions, etc., thus neglecting the rich high-level declarative domain knowledge of facts, relations and rules that can be either provided a priori or acquired through reasoning over time. Recently, there has been rapidly growing interest in the use of Knowledge Representation and Reasoning (KRR) methods, usually using logical languages, to enable more abstract representation and efficient learning in RL. In this survey, we provide a preliminary overview of these endeavors that leverage the strengths of KRR to help solve various problems in RL, and discuss the challenging open problems and possible directions for future work in this area.
Submitted 23 February, 2025; v1 submitted 24 April, 2023;
originally announced April 2023.
-
Models as Agents: Optimizing Multi-Step Predictions of Interactive Local Models in Model-Based Multi-Agent Reinforcement Learning
Authors:
Zifan Wu,
Chao Yu,
Chen Chen,
Jianye Hao,
Hankz Hankui Zhuo
Abstract:
Research in model-based reinforcement learning has made significant progress in recent years. Compared to single-agent settings, the exponential dimension growth of the joint state-action space in multi-agent systems dramatically increases the complexity of the environment dynamics, which makes it infeasible to learn an accurate global model and thus necessitates the use of agent-wise local models. However, during multi-step model rollouts, the prediction of one local model can affect the predictions of other local models in the next step. As a result, local prediction errors can be propagated to other localities and eventually give rise to considerably large global errors. Furthermore, since the models are generally used to predict for multiple steps, simply minimizing one-step prediction errors regardless of their long-term effect on other models may further aggravate the propagation of local errors. To this end, we propose Models as AGents (MAG), a multi-agent model optimization framework that reversely treats the local models as multi-step decision making agents and the current policies as the dynamics during the model rollout process. In this way, the local models are able to consider their multi-step mutual effects on each other before making predictions. Theoretically, we show that the objective of MAG is approximately equivalent to maximizing a lower bound of the true environment return. Experiments on the challenging StarCraft II benchmark demonstrate the effectiveness of MAG.
Submitted 31 March, 2023;
originally announced March 2023.
-
Plan To Predict: Learning an Uncertainty-Foreseeing Model for Model-Based Reinforcement Learning
Authors:
Zifan Wu,
Chao Yu,
Chen Chen,
Jianye Hao,
Hankz Hankui Zhuo
Abstract:
In Model-based Reinforcement Learning (MBRL), model learning is critical since an inaccurate model can bias policy learning via generating misleading samples. However, learning an accurate model can be difficult since the policy is continually updated and the induced distribution over visited states used for model learning shifts accordingly. Prior methods alleviate this issue by quantifying the uncertainty of model-generated samples. However, these methods only quantify the uncertainty passively after the samples were generated, rather than foreseeing the uncertainty before model trajectories fall into those highly uncertain regions. The resulting low-quality samples can induce unstable learning targets and hinder the optimization of the policy. Moreover, while being learned to minimize one-step prediction errors, the model is generally used to predict for multiple steps, leading to a mismatch between the objectives of model learning and model usage. To this end, we propose Plan To Predict (P2P), an MBRL framework that treats the model rollout process as a sequential decision making problem by reversely considering the model as a decision maker and the current policy as the dynamics. In this way, the model can quickly adapt to the current policy and foresee the multi-step future uncertainty when generating trajectories. Theoretically, we show that the performance of P2P can be guaranteed by approximately optimizing a lower bound of the true environment return. Empirical results demonstrate that P2P achieves state-of-the-art performance on several challenging benchmark tasks.
Submitted 20 January, 2023;
originally announced January 2023.
-
A Hierarchical Temporal Planning-Based Approach for Dynamic Hoist Scheduling Problems
Authors:
Kebing Jin,
Yingkai Xiao,
Hankz Hankui Zhuo,
Renyong Ma
Abstract:
Hoist scheduling has become a bottleneck in electroplating industry applications with the development of autonomous devices. Although a few approaches have been proposed to target this challenging problem, they generally cannot scale to large-scale scheduling problems. In this paper, we formulate the hoist scheduling problem as a new temporal planning problem in the form of adapted PDDL, and propose a novel hierarchical temporal planning approach to efficiently solve the scheduling problem. Additionally, we provide a collection of real-life benchmark instances that can be used to evaluate solution methods for the problem. We show that, in comparison with state-of-the-art baselines, the proposed approach is able to efficiently find high-quality solutions for large-scale real-life benchmark instances.
Submitted 11 December, 2022;
originally announced December 2022.
-
Learning Visual Planning Models from Partially Observed Images
Authors:
Kebing Jin,
Zhanhao Xiao,
Hankui Hankz Zhuo,
Hai Wan,
Jiaran Cai
Abstract:
There has been increasing attention on planning model learning in classical planning. Most existing approaches, however, focus on learning planning models from structured data in symbolic representations. It is often difficult to obtain such structured data in real-world scenarios. Although a number of approaches have been developed for learning planning models from fully observed unstructured data (e.g., images), in many scenarios raw observations are often incomplete. In this paper, we provide a novel framework, Recplan, for learning a transition model from partially observed raw image traces. More specifically, by considering the preceding and subsequent images in a trace, we learn the latent state representations of raw observations and then build a transition model based on such representations. Additionally, we propose a neural-network-based approach to learn a heuristic model that estimates the distance toward a given goal observation. Based on the learned transition model and heuristic model, we implement a classical planner for images. We exhibit empirically that our approach is more effective than a state-of-the-art approach of learning visual planning models in the environment with incomplete observations.
Submitted 25 November, 2022;
originally announced November 2022.
-
Text-Based Action-Model Acquisition for Planning
Authors:
Kebing Jin,
Huaixun Chen,
Hankz Hankui Zhuo
Abstract:
Although there have been approaches that are capable of learning action models from plan traces, there is no work on learning action models from textual observations, which are pervasive and much easier to collect from real-world applications than plan traces. In this paper we propose a novel approach to learning action models from natural language texts by integrating Constraint Satisfaction and Natural Language Processing techniques. Specifically, we first build a novel language model to extract plan traces from texts, and then build a set of constraints to generate action models based on the extracted plan traces. After that, we iteratively improve the language model and constraints until the language model and action models converge. We show empirically that our approach is both effective and efficient.
Submitted 17 February, 2022; v1 submitted 14 February, 2022;
originally announced February 2022.
-
Integrating AI Planning with Natural Language Processing: A Combination of Explicit and Tacit Knowledge
Authors:
Kebing Jin,
Hankz Hankui Zhuo
Abstract:
Natural language processing (NLP) aims at investigating the interactions between agents and humans, processing and analyzing large amounts of natural language data. Large-scale language models play an important role in current natural language processing. However, challenges of explainability and complexity have come along with the development of language models. One way to address them is to introduce logical relations and rules into natural language processing models, for example by making use of automated planning. Automated planning (AI planning) focuses on building symbolic domain models and synthesizing plans that transition initial states to goals based on domain models. Recently, there has been a large body of work related to these two fields, which are able to generate explicit knowledge, e.g., preconditions and effects of action models, and to learn from tacit knowledge, e.g., neural models, respectively. Integrating AI planning and natural language processing effectively improves the communication between humans and intelligent agents. This paper outlines the commonalities and relations between AI planning and natural language processing, and argues that each of them can effectively impact the other in five areas: (1) planning-based text understanding, (2) planning-based natural language processing, (3) planning-based explainability, (4) text-based human-robot interaction, and (5) applications. We also explore some potential future issues between AI planning and natural language processing. To the best of our knowledge, this survey is the first work that addresses the deep connections between AI planning and natural language processing.
Submitted 13 April, 2023; v1 submitted 14 February, 2022;
originally announced February 2022.
-
Introduction to The Dynamic Pickup and Delivery Problem Benchmark -- ICAPS 2021 Competition
Authors:
Jianye Hao,
Jiawen Lu,
Xijun Li,
Xialiang Tong,
Xiang Xiang,
Mingxuan Yuan,
Hankz Hankui Zhuo
Abstract:
The Dynamic Pickup and Delivery Problem (DPDP) is an essential problem within the logistics domain. So far, research on this problem has mainly focused on using artificial data, which fails to reflect the complexity of real-world problems. In this draft, we introduce a new benchmark from real business scenarios as well as a simulator supporting dynamic evaluation. The benchmark and simulator have been published and successfully supported the ICAPS 2021 Dynamic Pickup and Delivery Problem competition, in which 152 teams participated.
Submitted 18 January, 2022;
originally announced February 2022.
-
Creativity of AI: Hierarchical Planning Model Learning for Facilitating Deep Reinforcement Learning
Authors:
Hankz Hankui Zhuo,
Shuting Deng,
Mu Jin,
Zhihao Ma,
Kebing Jin,
Chen Chen,
Chao Yu
Abstract:
Despite achieving great success in real-world applications, Deep Reinforcement Learning (DRL) still suffers from three critical issues, i.e., data efficiency, lack of interpretability, and transferability. Recent research shows that embedding symbolic knowledge into DRL is promising in addressing those challenges. Inspired by this, we introduce a novel deep reinforcement learning framework with symbolic options. Our framework features a loop training procedure, which guides the improvement of the policy by planning with planning models (including action models and hierarchical task network models) and symbolic options learned automatically from interactive trajectories. The learned symbolic options alleviate the heavy requirement for expert domain knowledge and provide inherent interpretability of policies. Moreover, the transferability and data efficiency can be further improved by planning with the symbolic planning models. To validate the effectiveness of our framework, we conduct experiments on two domains, Montezuma's Revenge and Office World. The results demonstrate comparable performance and improved data efficiency, interpretability, and transferability.
Submitted 7 July, 2023; v1 submitted 17 December, 2021;
originally announced December 2021.
-
Retrosynthetic Planning with Experience-Guided Monte Carlo Tree Search
Authors:
Siqi Hong,
Hankz Hankui Zhuo,
Kebing Jin,
Guang Shao,
Zhanwen Zhou
Abstract:
In retrosynthetic planning, the huge number of possible routes to synthesize a complex molecule using simple building blocks leads to a combinatorial explosion of possibilities. Even experienced chemists often have difficulty selecting the most promising transformations. Current approaches rely on human-defined or machine-trained score functions, which have limited chemical knowledge or use expensive estimation methods for guiding. Here we propose an experience-guided Monte Carlo tree search (EG-MCTS) to deal with this problem. Instead of rollout, we build an experience guidance network to learn knowledge from synthetic experiences during the search. Experiments on benchmark USPTO datasets show that EG-MCTS gains significant improvement over state-of-the-art approaches in both efficiency and effectiveness. In a comparative experiment with the literature, our computer-generated routes mostly matched the reported routes. Routes designed for real drug compounds exhibit the effectiveness of EG-MCTS in assisting chemists in performing retrosynthetic analysis.
Submitted 9 June, 2023; v1 submitted 11 December, 2021;
originally announced December 2021.
-
Lifelong Reinforcement Learning with Temporal Logic Formulas and Reward Machines
Authors:
Xuejing Zheng,
Chao Yu,
Chen Chen,
Jianye Hao,
Hankz Hankui Zhuo
Abstract:
Continuously learning new tasks using high-level ideas or knowledge is a key capability of humans. In this paper, we propose Lifelong reinforcement learning with Sequential linear temporal logic formulas and Reward Machines (LSRM), which enables an agent to leverage previously learned knowledge to speed up the learning of logically specified tasks. For the sake of more flexible specification of tasks, we first introduce Sequential Linear Temporal Logic (SLTL), which is a supplement to the existing Linear Temporal Logic (LTL) formal language. We then utilize Reward Machines (RM) to exploit structural reward functions for tasks encoded with high-level events, and propose automatic extension of RM and efficient knowledge transfer over tasks for continuous learning over the agent's lifetime. Experimental results show that LSRM outperforms methods that learn the target tasks from scratch by taking advantage of the task decomposition using SLTL and knowledge transfer over RM during the lifelong learning process.
Submitted 17 November, 2021;
originally announced November 2021.
-
Coordinated Proximal Policy Optimization
Authors:
Zifan Wu,
Chao Yu,
Deheng Ye,
Junge Zhang,
Haiyin Piao,
Hankz Hankui Zhuo
Abstract:
We present Coordinated Proximal Policy Optimization (CoPPO), an algorithm that extends the original Proximal Policy Optimization (PPO) to the multi-agent setting. The key idea lies in the coordinated adaptation of step size during the policy update process among multiple agents. We prove the monotonicity of policy improvement when optimizing a theoretically-grounded joint objective, and derive a simplified optimization objective based on a set of approximations. We then interpret that such an objective in CoPPO can achieve dynamic credit assignment among agents, thereby alleviating the high variance issue during the concurrent update of agent policies. Finally, we demonstrate that CoPPO outperforms several strong baselines and is competitive with the latest multi-agent PPO method (i.e. MAPPO) under typical multi-agent settings, including cooperative matrix games and the StarCraft II micromanagement tasks.
Submitted 7 November, 2021;
originally announced November 2021.
-
Gradient-Based Mixed Planning with Symbolic and Numeric Action Parameters
Authors:
Kebing Jin,
Hankz Hankui Zhuo,
Zhanhao Xiao,
Hai Wan,
Subbarao Kambhampati
Abstract:
Dealing with planning problems with both logical relations and numeric changes in real-world dynamic environments is challenging. Existing numeric planning systems for the problem often discretize numeric variables or impose convex constraints on numeric variables, which harms performance when solving problems. In this paper, we propose a novel algorithm framework, based on gradient descent, to solve numeric planning problems mixed with logical relations and numeric changes. We cast numeric planning with logical relations and numeric changes as an optimization problem. Specifically, we extend the syntax to allow parameters of action models to be either objects or real-valued numbers, which enhances the ability to model real-world numeric effects. Based on the extended modeling language, we propose a gradient-based framework to simultaneously optimize numeric parameters and compute appropriate actions to form candidate plans. The gradient-based framework is composed of an algorithmic heuristic module based on propositional operations to select actions and generate constraints for gradient descent, an algorithmic transition module to update states to next ones, and a loss module to compute the loss. We repeatedly minimize the loss by updating numeric parameters and computing candidate plans until the process converges to a valid plan for the planning problem. In the empirical study, we show that our algorithm framework is both effective and efficient in solving planning problems mixed with logical relations and numeric changes, especially when the problems contain obstacles and non-linear numeric effects.
Submitted 9 October, 2022; v1 submitted 19 October, 2021;
originally announced October 2021.
-
Branched flow of intense laser light in plasma with uneven density distribution
Authors:
K. Jiang,
T. W. Huang,
C. N. Wu,
M. Y. Yu,
H. Zhang,
S. Z. Wu,
H. B. Zhuo,
A. Pukhov,
C. T. Zhou,
S. C. Ruan
Abstract:
Branched flow is an interesting phenomenon that can occur in diverse systems. It is usually linear in the sense that the flow does not alter the medium properties. Branched flow of light on thin films was recently discovered. A question of interest is thus whether nonlinear branched flow of light can also occur. Here we find, using particle-in-cell simulations, that when an intense laser propagates in plasma with a randomly uneven density distribution, photoionization by the laser can locally enhance the density variations along the laser paths and thus the branching of the laser. However, too-intense lasers can smooth the uneven electron density and suppress branching. The observed branching properties agree well with an analysis based on a Helmholtz equation for the laser electric field. Branched flow of intense laser light in uneven plasma potentially opens up a new realm of intense laser-matter interaction.
Submitted 15 May, 2022; v1 submitted 29 September, 2021;
originally announced September 2021.
-
Learning Symbolic Rules for Interpretable Deep Reinforcement Learning
Authors:
Zhihao Ma,
Yuzheng Zhuang,
Paul Weng,
Hankz Hankui Zhuo,
Dong Li,
Wulong Liu,
Jianye Hao
Abstract:
Recent progress in deep reinforcement learning (DRL) can be largely attributed to the use of neural networks. However, this black-box approach fails to explain the learned policy in a human-understandable way. To address this challenge and improve transparency, we propose a Neural Symbolic Reinforcement Learning framework by introducing symbolic logic into DRL. This framework features a fertilization of reasoning and learning modules, enabling end-to-end learning with prior symbolic knowledge. Moreover, interpretability is achieved by extracting the logical rules learned by the reasoning module in a symbolic rule space. The experimental results show that our framework has better interpretability, along with competitive performance in comparison to state-of-the-art approaches.
Submitted 16 March, 2021; v1 submitted 15 March, 2021;
originally announced March 2021.
-
NTIRE 2020 Challenge on Real-World Image Super-Resolution: Methods and Results
Authors:
Andreas Lugmayr,
Martin Danelljan,
Radu Timofte,
Namhyuk Ahn,
Dongwoon Bai,
Jie Cai,
Yun Cao,
Junyang Chen,
Kaihua Cheng,
SeYoung Chun,
Wei Deng,
Mostafa El-Khamy,
Chiu Man Ho,
Xiaozhong Ji,
Amin Kheradmand,
Gwantae Kim,
Hanseok Ko,
Kanghyu Lee,
Jungwon Lee,
Hao Li,
Ziluan Liu,
Zhi-Song Liu,
Shuai Liu,
Yunhua Lu,
Zibo Meng,
et al. (21 additional authors not shown)
Abstract:
This paper reviews the NTIRE 2020 challenge on real-world super-resolution. It focuses on the participating methods and final results. The challenge addresses the real-world setting, where paired true high- and low-resolution images are unavailable. For training, only one set of source input images is therefore provided along with a set of unpaired high-quality target images. In Track 1: Image Processing Artifacts, the aim is to super-resolve images with synthetically generated image processing artifacts. This allows for quantitative benchmarking of the approaches with respect to a ground-truth image. In Track 2: Smartphone Images, real low-quality smartphone images have to be super-resolved. In both tracks, the ultimate goal is to achieve the best perceptual quality, evaluated using a human study. This is the second challenge on the subject, following AIM 2019, aiming to advance the state of the art in super-resolution. To measure the performance we use the benchmark protocol from AIM 2019. In total 22 teams competed in the final testing phase, demonstrating new and innovative solutions to the problem.
Submitted 5 May, 2020;
originally announced May 2020.
-
Dual Graph Representation Learning
Authors:
Huiling Zhu,
Xin Luo,
Hankz Hankui Zhuo
Abstract:
Graph representation learning embeds nodes in large graphs as low-dimensional vectors and is of great benefit to many downstream applications. Most embedding frameworks, however, are inherently transductive and unable to generalize to unseen nodes or learn representations across different graphs. Although inductive approaches can generalize to unseen nodes, they neglect different contexts of nodes and cannot learn node embeddings dually. In this paper, we present a context-aware unsupervised dual encoding framework, CADE, to generate representations of nodes by combining real-time neighborhoods with neighbor-attentioned representation, and preserving extra memory of known nodes. We show that our approach is effective by comparing it with state-of-the-art methods.
Submitted 24 February, 2020;
originally announced February 2020.
-
Refining HTN Methods via Task Insertion with Preferences
Authors:
Zhanhao Xiao,
Hai Wan,
Hankui Hankz Zhuo,
Andreas Herzig,
Laurent Perrussel,
Peilin Chen
Abstract:
Hierarchical Task Network (HTN) planning is showing its power in real-world planning. Although domain experts have partial hierarchical domain knowledge, it is time-consuming to specify all HTN methods, leaving them incomplete. On the other hand, traditional HTN learning approaches focus only on declarative goals, omitting the hierarchical domain knowledge. In this paper, we propose a novel learning framework to refine HTN methods via task insertion while completely preserving the original methods. As it is difficult to identify incomplete methods without designating declarative goals for compound tasks, we introduce the notion of prioritized preference to capture the incompleteness possibility of methods. Specifically, the framework first computes the preferred completion profile w.r.t. the prioritized preference to refine the incomplete methods. Then it finds the minimal set of refined methods via a method substitution operation. Experimental analysis demonstrates that our approach is effective, especially in solving new HTN planning instances.
Submitted 28 November, 2019;
originally announced November 2019.
-
Transfer Value Iteration Networks
Authors:
Junyi Shen,
Hankz Hankui Zhuo,
Jin Xu,
Bin Zhong,
Sinno Jialin Pan
Abstract:
Value iteration networks (VINs) have been demonstrated to have a good generalization ability for reinforcement learning tasks across similar domains. However, based on our experiments, a policy learned by VINs still fails to generalize well to a domain whose action space and feature space are not identical to those in the domain where it is trained. In this paper, we propose a transfer learning approach on top of VINs, termed Transfer VINs (TVINs), such that a learned policy from a source domain can be generalized to a target domain with only limited training data, even if the source domain and the target domain have domain-specific actions and features. We empirically verify that our proposed TVINs outperform VINs when the source and the target domains have similar but not identical action and feature spaces. Furthermore, we show that the performance improvement is consistent across different environments, maze sizes, and dataset sizes, as well as different values of hyperparameters such as the number of iterations and kernel size.
Submitted 26 November, 2019; v1 submitted 11 November, 2019;
originally announced November 2019.
-
Repositioning Bikes with Carrier Vehicles and Bike Trailers in Bike Sharing Systems
Authors:
Xinghua Zheng,
Ming Tang,
Hankz Hankui Zhuo,
Kevin X. Wen
Abstract:
Bike Sharing Systems (BSSs) have been adopted in many major cities of the world due to traffic congestion and carbon emissions. Although there have been approaches that exploit either bike trailers via crowdsourcing or carrier vehicles to reposition bikes in the "right" stations at the "right" time, they do not jointly consider the usage of both bike trailers and carrier vehicles. In this paper, we aim to take advantage of both bike trailers and carrier vehicles to reduce the loss of demand with regard to the crowdsourcing of bike trailers and the fuel cost of carrier vehicles. In the experiments, we show that our approach outperforms baselines on several datasets from bike sharing companies.
Submitted 20 September, 2019;
originally announced September 2019.
-
Learning Action Models from Disordered and Noisy Plan Traces
Authors:
Hankz Hankui Zhuo,
Jing Peng,
Subbarao Kambhampati
Abstract:
There is increasing awareness in the planning community that the burden of specifying complete domain models is too high, which impedes the applicability of planning technology in many real-world domains. Although there have been many learning systems that help automatically learn domain models, most existing work assumes that the input traces are completely correct. A more realistic situation is that the plan traces are disordered and noisy, such as plan traces described in natural language. In this paper we propose and evaluate an approach for learning in this setting. Our approach takes as input a set of plan traces with disordered actions and noise and outputs action models that can best explain the plan traces. We use a MAX-SAT framework for learning, where the constraints are derived from the given plan traces. Unlike with traditional action model learners, the states in plan traces can be partially observable and noisy, and the actions in plan traces can be disordered and parallel. We demonstrate the effectiveness of our approach through a systematic empirical evaluation with both IPC domains and a real-world dataset extracted from natural language documents.
Submitted 9 September, 2019; v1 submitted 26 August, 2019;
originally announced August 2019.
-
Representation Learning for Classical Planning from Partially Observed Traces
Authors:
Zhanhao Xiao,
Hai Wan,
Hankui Hankz Zhuo,
Jinxia Lin,
Yanan Liu
Abstract:
Specifying a complete domain model is time-consuming, which has been a bottleneck in applying AI planning techniques in many real-world scenarios. Most classical domain-model learning approaches output a domain model in the form of a declarative planning language, such as STRIPS or PDDL, and solve new planning instances by invoking an existing planner. However, planning in such a representation is sensitive to the accuracy of the learned domain model, which probably cannot be used to solve real planning problems. In this paper, to represent domain models in a vectorized form, we propose a novel framework based on a graph neural network (GNN) integrating model-free learning and model-based planning, called LP-GNN. By embedding propositions and actions in a graph, the latent relationship between them is explored to form domain-specific heuristics. We evaluate our approach on five classical planning domains, comparing it with the classical domain-model learner ARMS. The experimental results show that the domain models learned by our approach are much more effective in solving real planning problems.
Submitted 18 July, 2019;
originally announced July 2019.
-
Federated Hierarchical Hybrid Networks for Clickbait Detection
Authors:
Feng Liao,
Hankz Hankui Zhuo,
Xiaoling Huang,
Yu Zhang
Abstract:
Online media outlets adopt clickbait techniques to lure readers to click on articles in a bid to expand their reach and subsequently increase revenue through ad monetization. As the adverse effects of clickbait attract more and more attention, researchers have started to explore machine learning techniques to automatically detect clickbait. Previous work on clickbait detection assumes that all the training data is available locally during training. In many real-world applications, however, training data is generally stored in a distributed manner by different parties (e.g., different parties maintain data with different feature spaces), and the parties cannot share their data with each other due to data privacy issues. It is challenging to build high-quality models in a federated manner for detecting clickbait effectively without data sharing. In this paper, we propose a federated training framework, called federated hierarchical hybrid networks, to build clickbait detection models, where the titles and contents are stored by different parties and the relationships between them must be exploited for clickbait detection. We empirically demonstrate that our approach is effective by comparing it to state-of-the-art approaches using datasets from social media.
Submitted 3 June, 2019;
originally announced June 2019.
-
Federated Deep Reinforcement Learning
Authors:
Hankz Hankui Zhuo,
Wenfeng Feng,
Yufeng Lin,
Qian Xu,
Qiang Yang
Abstract:
In deep reinforcement learning, building high-quality policies is challenging when the feature space of states is small and the training data is limited. Despite the success of previous transfer learning approaches in deep reinforcement learning, directly transferring data or models from one agent to another is often not allowed due to the privacy of data and/or models in many privacy-aware applications. In this paper, we propose a novel deep reinforcement learning framework, namely Federated deep Reinforcement Learning (FedRL), to build high-quality models for agents in a federated manner with consideration of their privacy. To protect the privacy of data and models, we exploit Gaussian differentials on the information the agents share with each other when updating their local models. In the experiments, we evaluate our FedRL framework in two diverse domains, Grid-world and Text2Action, by comparing it to various baselines.
Submitted 9 February, 2020; v1 submitted 24 January, 2019;
originally announced January 2019.
-
SCSP: Spectral Clustering Filter Pruning with Soft Self-adaption Manners
Authors:
Huiyuan Zhuo,
Xuelin Qian,
Yanwei Fu,
Heng Yang,
Xiangyang Xue
Abstract:
Deep Convolutional Neural Networks (CNNs) have achieved significant success in the computer vision field. However, the high computational cost of deep complex models prevents their deployment on edge devices with limited memory and computational resources. In this paper, we propose a novel filter pruning method for convolutional neural network compression, namely spectral clustering filter pruning with soft self-adaption manners (SCSP). We first apply spectral clustering on filters layer by layer to explore their intrinsic connections and only count on efficient groups. By self-adaption manners, the pruning operations can be done in a few epochs to let the network gradually choose meaningful groups. According to this strategy, we not only achieve model compression while keeping considerable performance, but also find a novel angle to interpret the model compression process.
Submitted 13 June, 2018;
originally announced June 2018.
-
An Integrated Development Environment for Planning Domain Modeling
Authors:
Yuncong Li,
Hankz Hankui Zhuo
Abstract:
To make the task of describing planning domains and problems more comprehensible to non-experts in planning, visual representations have been used in planning domain modeling in recent years. However, current knowledge engineering tools with visual modeling, like itSIMPLE (Vaquero et al. 2012) and VIZ (Vodrážka and Chrpa 2010), are less efficient than the traditional method of hand-coding by a PDDL expert using a text editor, and are rarely involved in fine-tuning planning domains based on plan validation. To address this, we present KAVI, an integrated development environment for planning domain modeling inspired by itSIMPLE and VIZ. KAVI uses an abstract domain knowledge base to improve the efficiency of visual planning domain modeling. By integrating planners and a plan validator, KAVI proposes a method to fine-tune planning domains based on plan validation.
Submitted 19 April, 2018;
originally announced April 2018.
-
Extracting Action Sequences from Texts Based on Deep Reinforcement Learning
Authors:
Wenfeng Feng,
Hankz Hankui Zhuo,
Subbarao Kambhampati
Abstract:
Extracting action sequences from natural language texts is challenging, as it requires commonsense inferences based on world knowledge. Although there has been work on extracting action scripts, instructions, navigation actions, etc., they require that either the set of candidate actions be provided in advance, or that action descriptions are restricted to a specific form, e.g., description templates. In this paper, we aim to extract action sequences from texts in free natural language, i.e., without any restricted templates, provided the candidate set of actions is unknown. We propose to extract action sequences from texts based on the deep reinforcement learning framework. Specifically, we view "selecting" or "eliminating" words from texts as "actions", and the texts associated with actions as "states". We then build Q-networks to learn the policy of extracting actions and extract plans from the labeled texts. We demonstrate the effectiveness of our approach on several datasets with comparison to state-of-the-art approaches, including online experiments interacting with humans.
Submitted 11 May, 2018; v1 submitted 7 March, 2018;
originally announced March 2018.
-
Discovering Underlying Plans Based on Shallow Models
Authors:
Hankz Hankui Zhuo,
Yantian Zha,
Subbarao Kambhampati
Abstract:
Plan recognition aims to discover target plans (i.e., sequences of actions) behind observed actions, with history plan libraries or domain models in hand. Previous approaches either discover plans by maximally "matching" observed actions to plan libraries, assuming target plans are from plan libraries, or infer plans by executing domain models to best explain the observed actions, assuming that complete domain models are available. In real-world applications, however, target plans are often not from plan libraries, and complete domain models are often not available, since building complete sets of plans and complete domain models is often difficult or expensive. In this paper we view plan libraries as corpora and learn vector representations of actions using the corpora; we then discover target plans based on the vector representations. Specifically, we propose two approaches, DUP and RNNPlanner, to discover target plans based on vector representations of actions. DUP explores an EM-style framework to capture local contexts of actions and discover target plans by optimizing the probability of target plans, while RNNPlanner aims to leverage long-short term contexts of actions based on the RNN (recurrent neural network) framework to help recognize target plans. In the experiments, we empirically show that our approaches are capable of discovering underlying plans that are not from plan libraries, without requiring domain models to be provided. We demonstrate the effectiveness of our approaches by comparing their performance to traditional plan recognition approaches in three planning domains. We also compare DUP and RNNPlanner to see their advantages and disadvantages.
Submitted 3 March, 2018;
originally announced March 2018.
-
Effective suppression of parametric instabilities with decoupled broadband lasers in plasma
Authors:
Yao Zhao,
Suming Weng,
Min Chen,
Jun Zheng,
Hongbin Zhuo,
Chuang Ren,
Zhengming Sheng,
Jie Zhang
Abstract:
A theoretical analysis of the stimulated Raman scattering (SRS) instability driven by two laser beams with a certain frequency difference is presented. It is found that strong coupling and enhanced SRS take place only when the unstable regions corresponding to the two beams overlap in wavenumber space. Hence a threshold of the beam frequency difference for their decoupling is found as a function of their intensity and plasma density. Based upon this, a strategy to suppress the SRS instability with decoupled broadband lasers (DBLs) is proposed. A DBL can be composed of tens or even hundreds of beamlets, where the beamlets are distributed uniformly in a broad spectrum range such as over 10\% of the central frequency. Decoupling among the beamlets is found due to the limited beamlet energy and suitable frequency difference between neighboring beamlets. Particle-in-cell simulations demonstrate that SRS can be almost completely suppressed with DBLs at the laser intensity $\sim10^{15}$ W/cm$^2$. Moreover, stimulated Brillouin scattering (SBS) will be suppressed simultaneously with DBLs as long as SRS is suppressed. DBLs can be attractive for driving inertial confinement fusion.
Submitted 23 March, 2022; v1 submitted 12 August, 2017;
originally announced August 2017.
-
Paper2vec: Citation-Context Based Document Distributed Representation for Scholar Recommendation
Authors:
Han Tian,
Hankz Hankui Zhuo
Abstract:
Due to the availability of references of research papers and the rich information contained in papers, various citation analysis approaches have been proposed to identify similar documents for scholar recommendation. Despite the success of previous approaches, they are based on co-occurrence of items; when no co-occurring items are available in documents, they do not work well. Inspired by distributed representations of words in the literature of natural language processing, we propose a novel approach to measuring the similarity of papers based on distributed representations learned from the citation context of papers. We view the set of papers as the vocabulary, define the weighted citation context of papers, and convert it to a weight matrix similar to the word-word co-occurrence matrix in natural language processing. After that we explore a variant of the matrix factorization approach to train distributed representations of papers on the matrix, and leverage the distributed representations to measure similarities of papers. In the experiments, we show that our approach outperforms state-of-the-art citation-based approaches by 25%, and performs better than other distributed-representation-based methods.
Submitted 19 March, 2017;
originally announced March 2017.
-
Distributed-Representation Based Hybrid Recommender System with Short Item Descriptions
Authors:
Junhua He,
Hankz Hankui Zhuo,
Jarvan Law
Abstract:
Collaborative filtering (CF) aims to build a model from users' past behaviors and/or similar decisions made by other users, and use the model to recommend items for users. Despite the success of previous collaborative filtering approaches, they are all based on the assumption that there are sufficient rating scores available for building high-quality recommendation models. In real-world applications, however, it is often difficult to collect sufficient rating scores, especially when new items are introduced into the system, which makes the recommendation task challenging. We find that there are often "short" texts describing features of items, based on which we can approximate the similarity of items and make recommendations together with rating scores. In this paper we "borrow" the idea of vector representation of words to capture the information of short texts and embed it into a matrix factorization framework. We empirically show that our approach is effective by comparing it with state-of-the-art approaches.
Submitted 14 March, 2017;
originally announced March 2017.
-
Embedding Knowledge Graphs Based on Transitivity and Antisymmetry of Rules
Authors:
Mengya Wang,
Hankui Zhuo,
Huiling Zhu
Abstract:
Representation learning of knowledge graphs encodes entities and relation types into a continuous low-dimensional vector space, learning embeddings of entities and relation types. Most existing methods only concentrate on knowledge triples, ignoring logic rules which contain rich background knowledge. Although there has been some work aiming at leveraging both knowledge triples and logic rules, it ignores the transitivity and antisymmetry of logic rules. In this paper, we propose a novel approach to learn knowledge representations with entities and ordered relations in knowledge and logic rules. The key idea is to integrate knowledge triples and logic rules, and approximately order the relation types in logic rules to utilize the transitivity and antisymmetry of logic rules. All entries of the embeddings of relation types are constrained to be non-negative. We translate the general constrained optimization problem into an unconstrained optimization problem to solve the non-negative matrix factorization. Experimental results show that our model significantly outperforms other baselines on the knowledge graph completion task. This indicates that our model is capable of capturing the transitivity and antisymmetry information, which is significant when learning embeddings of knowledge graphs.
Submitted 19 April, 2017; v1 submitted 24 February, 2017;
originally announced February 2017.
-
LTSG: Latent Topical Skip-Gram for Mutually Learning Topic Model and Vector Representations
Authors:
Jarvan Law,
Hankz Hankui Zhuo,
Junhua He,
Erhu Rong
Abstract:
Topic models have been widely used in discovering latent topics which are shared across documents in text mining. Vector representations, i.e., word embeddings and topic embeddings, map words and topics into a low-dimensional and dense real-valued vector space, and have obtained high performance in NLP tasks. However, most of the existing models assume that the result trained by one of them is perfectly correct and use it as prior knowledge for improving the other model. Some other models use the information trained from an external large corpus to help improve a smaller corpus. In this paper, we aim to build an algorithm framework that makes topic models and vector representations mutually improve each other within the same corpus. An EM-style algorithm framework is employed to iteratively optimize both the topic model and the vector representations. Experimental results show that our model outperforms state-of-the-art methods on various NLP tasks.
Submitted 23 February, 2017;
originally announced February 2017.
-
Containing intense laser light in circular cavity with magnetic trap door
Authors:
X. H. Yang,
W. Yu,
M. Y. Yu,
H. Xu,
Y. Y. Ma,
Z. M. Sheng,
H. B. Zhuo,
Z. Y. Ge,
F. Q. Shao
Abstract:
It is shown by particle-in-cell simulation that intense circularly polarized (CP) laser light can be contained in the cavity of a solid-density circular Al-plasma shell for hundreds of light-wave periods before it is dissipated by laser-plasma interaction. A right-hand CP laser pulse can propagate almost without reflection into the cavity through a highly magnetized overdense H-plasma slab filling the entrance hole. The entrapped laser light is then multiply reflected at the inner surfaces of the slab and shell plasmas, gradually losing energy to the latter. Compared to that of the incident laser, the frequency is only slightly broadened and the wave vector slightly modified by appearance of weak nearly isotropic and homogeneous higher harmonics.
Submitted 11 January, 2017;
originally announced January 2017.
-
Study of filamentation instability on the divergence of ultraintense laser-driven electrons
Authors:
X. H. Yang,
H. B. Zhuo,
H. Xu,
Z. Y. Ge,
F. Q. Shao,
M. Borghesi,
Y. Y. Ma
Abstract:
Generation of relativistic electron (RE) beams during ultraintense laser pulse interaction with plasma targets is studied by collisional particle-in-cell (PIC) simulations. A strong magnetic field with a transverse scale length of several local plasma skin depths, associated with RE current propagation in the target, is generated by filamentation instability (FI) in collisional plasmas, inducing a great enhancement of the divergence of REs compared to that of collisionless cases. This effect increases with laser intensity and target charge state, suggesting that the RE divergence might be improved by using low-Z materials under appropriate laser intensities in future fast ignition experiments and in other applications of laser-driven electron beams.
Submitted 14 October, 2016;
originally announced October 2016.
-
Plan Explicability and Predictability for Robot Task Planning
Authors:
Yu Zhang,
Sarath Sreedharan,
Anagha Kulkarni,
Tathagata Chakraborti,
Hankz Hankui Zhuo,
Subbarao Kambhampati
Abstract:
Intelligent robots and machines are becoming pervasive in human-populated environments. A desirable capability of these agents is to respond to goal-oriented commands by autonomously constructing task plans. However, such autonomy can add significant cognitive load and potentially introduce safety risks to humans when agents behave unexpectedly. Hence, for such agents to be helpful, one important requirement is for them to synthesize plans that can be easily understood by humans. While there exists previous work that studied socially acceptable robots that interact with humans in "natural ways", and work that investigated legible motion planning, a general solution for high-level task planning is lacking. To address this issue, we introduce the notions of plan {\it explicability} and {\it predictability}. To compute these measures, first, we postulate that humans understand agent plans by associating abstract tasks with agent actions, which can be considered as a labeling process. We learn the labeling scheme of humans for agent plans from training examples using conditional random fields (CRFs). Then, we use the learned model to label a new plan to compute its explicability and predictability. These measures can be used by agents to proactively choose or directly synthesize plans that are more explicable and predictable to humans. We provide evaluations on a synthetic domain and with human subjects using physical robots to show the effectiveness of our approach.
Submitted 12 April, 2016; v1 submitted 25 November, 2015;
originally announced November 2015.
-
Discovering Underlying Plans Based on Distributed Representations of Actions
Authors:
Xin Tian,
Hankz Hankui Zhuo,
Subbarao Kambhampati
Abstract:
Plan recognition aims to discover target plans (i.e., sequences of actions) behind observed actions, with history plan libraries or domain models in hand. Previous approaches either discover plans by maximally "matching" observed actions to plan libraries, assuming target plans are from plan libraries, or infer plans by executing domain models to best explain the observed actions, assuming complete domain models are available. In real-world applications, however, target plans are often not from plan libraries and complete domain models are often not available, since building complete sets of plans and complete domain models is often difficult or expensive. In this paper we view plan libraries as corpora and learn vector representations of actions using the corpora; we then discover target plans based on the vector representations. Our approach is capable of discovering underlying plans that are not from plan libraries, without requiring domain models to be provided. We empirically demonstrate the effectiveness of our approach by comparing its performance to traditional plan recognition approaches in three planning domains.
Submitted 18 November, 2015;
originally announced November 2015.
-
$D_{sJ}(2860)$ From The Semileptonic Decays Of $B_s$ Mesons
Authors:
Long-Fei Gan,
Jian-Rong Zhang,
Ming-Qiu Huang,
Hong-Bin Zhuo,
Yan-Yun Ma,
Qing-Jun Zhu,
Jian-Xun Liu,
Guo-Bo Zhang
Abstract:
In the framework of heavy quark effective theory, the leading order Isgur-Wise form factors relevant to semileptonic decays of the ground state $\bar{b}s$ meson $B_{s}$ into orbitally excited $D$-wave $\bar{c}s$ mesons, including the newly observed narrow $D^{*}_{s1}(2860)$ and $D^{*}_{s3}(2860)$ states by the LHCb Collaboration, are calculated with the QCD sum rule method. With these universal form factors, the decay rates and branching ratios are estimated. We find that the decay widths are $\Gamma(B_s\rightarrow D^{*}_{s1}\ell\bar{\nu}) =1.25^{+0.80}_{-0.60}\times10^{-19} \mbox{GeV}$, $\Gamma(B_s\rightarrow D^{'}_{s2}\ell\bar{\nu}) =1.49^{+0.97}_{-0.73}\times10^{-19} \mbox{GeV}$, $\Gamma(B_s\rightarrow D_{s2}\ell\bar{\nu}) =4.48^{+1.05}_{-0.94}\times10^{-17} \mbox{GeV}$, and $\Gamma(B_s\rightarrow D^{*}_{s3}\ell\bar{\nu}) = 1.52^{+0.35}_{-0.31}\times10^{-16} \mbox{GeV}$. The corresponding branching ratios are $\mathcal {B}(B_s\rightarrow D^{*}_{s1}\ell\bar{\nu}) =2.85^{+1.82}_{-1.36}\times 10^{-7}$, $\mathcal {B}(B_s\rightarrow D^{'}_{s2}\ell\bar{\nu}) =3.40^{+2.21}_{-1.66}\times 10^{-7}$, $\mathcal {B}(B_{s}\rightarrow D_{s2}\ell\bar{\nu}) =1.02^{+0.24}_{-0.21}\times 10^{-4}$, and $\mathcal {B}(B_s\rightarrow D^{*}_{s3}\ell\bar{\nu}) = 3.46^{+0.80}_{-0.70}\times 10^{-4}$. The decay widths and branching ratios of corresponding $B^{*}_{s}$ semileptonic processes are also predicted.
Submitted 3 April, 2015; v1 submitted 26 December, 2014;
originally announced December 2014.