-
An Efficient Dynamic Resource Allocation Framework for Evolutionary Bilevel Optimization
Authors:
Dejun Xu,
Kai Ye,
Zimo Zheng,
Tao Zhou,
Gary G. Yen,
Min Jiang
Abstract:
Bilevel optimization problems are characterized by an interactive hierarchical structure, where the upper level seeks to optimize its strategy while simultaneously considering the response of the lower level. Evolutionary algorithms are commonly used to solve complex bilevel problems in practical scenarios, but they face significant resource consumption challenges due to the nested structure imposed by the implicit lower-level optimality condition. This challenge becomes even more pronounced as problem dimensions increase. Although recent methods have enhanced bilevel convergence through task-level knowledge sharing, further efficiency improvements are still hindered by redundant lower-level iterations that consume excessive resources while generating unpromising solutions. To overcome this challenge, this paper proposes an efficient dynamic resource allocation framework for evolutionary bilevel optimization, named DRC-BLEA. Compared to existing approaches, DRC-BLEA introduces a novel competitive quasi-parallel paradigm, in which multiple lower-level optimization tasks, derived from different upper-level individuals, compete for resources. A continuously updated selection probability is used to prioritize execution opportunities to promising tasks. Additionally, a cooperation mechanism is integrated within the competitive framework to further enhance efficiency and prevent premature convergence. Experimental results compared with chosen state-of-the-art algorithms demonstrate the effectiveness of the proposed method. Specifically, DRC-BLEA achieves competitive accuracy across diverse problem sets and real-world scenarios, while significantly reducing the number of function evaluations and overall running time.
Submitted 6 November, 2024; v1 submitted 31 October, 2024;
originally announced October 2024.
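Note: the abstract above says that a continuously updated selection probability decides which lower-level task gets to run next, but it does not say how that probability is computed. The sketch below is one possible reading, not the DRC-BLEA procedure: it assumes a softmax over an exponential moving average of each task's recent improvement. The names `selection_probabilities`, `update_score`, and `allocate`, and the `temperature` and `decay` parameters, are all hypothetical.

```python
import math
import random
from typing import Callable, List


def selection_probabilities(scores: List[float], temperature: float = 1.0) -> List[float]:
    """Softmax over task scores: a higher score gives a higher chance of running next."""
    top = max(scores)  # subtract the max for numerical stability
    weights = [math.exp((s - top) / temperature) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]


def update_score(old: float, improvement: float, decay: float = 0.8) -> float:
    """Exponential moving average of a task's recent improvement (assumed rule)."""
    return decay * old + (1.0 - decay) * improvement


def allocate(step_fns: List[Callable[[], float]], budget: int, seed: int = 0) -> List[int]:
    """Spend `budget` execution slots across competing tasks.

    Each entry of `step_fns` runs one step of a lower-level task and returns
    the improvement it achieved. Returns how many slots each task received.
    """
    rng = random.Random(seed)
    scores = [0.0] * len(step_fns)
    runs = [0] * len(step_fns)
    for _ in range(budget):
        probs = selection_probabilities(scores)
        k = rng.choices(range(len(step_fns)), weights=probs)[0]
        scores[k] = update_score(scores[k], step_fns[k]())
        runs[k] += 1
    return runs
```

Because tasks are sampled rather than ranked, a task with a low score still gets occasional slots, which is one simple way to keep unpromising tasks from being starved completely.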
-
HMAMP: Hypervolume-Driven Multi-Objective Antimicrobial Peptides Design
Authors:
Li Wang,
Yiping Li,
Xiangzheng Fu,
Xiucai Ye,
Junfeng Shi,
Gary G. Yen,
Xiangxiang Zeng
Abstract:
Antimicrobial peptides (AMPs) have exhibited unprecedented potential as biomaterials in combating multidrug-resistant bacteria. Despite the increasing adoption of artificial intelligence for novel AMP design, challenges pertaining to conflicting attributes such as activity, hemolysis, and toxicity have significantly impeded the progress of researchers. This paper introduces a paradigm shift by considering multiple attributes in AMP design.
Presented herein is a novel approach termed Hypervolume-driven Multi-objective Antimicrobial Peptide Design (HMAMP), which prioritizes the simultaneous optimization of multiple attributes of AMPs. By synergizing reinforcement learning and a gradient descent algorithm rooted in the hypervolume maximization concept, HMAMP effectively expands exploration space and mitigates the issue of pattern collapse. This method generates a wide array of prospective AMP candidates that strike a balance among diverse attributes. Furthermore, we pinpoint knee points along the Pareto front of these candidate AMPs. Empirical results across five benchmark models substantiate that HMAMP-designed AMPs exhibit competitive performance and heightened diversity. A detailed analysis of the helical structures and molecular dynamics simulations for ten potential candidate AMPs validates the superiority of HMAMP in the realm of multi-objective AMP design. The ability of HMAMP to systematically craft AMPs considering multiple attributes marks a pioneering milestone, establishing a universal computational framework for the multi-objective design of AMPs.
Submitted 1 May, 2024;
originally announced May 2024.
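Note: the hypervolume indicator mentioned in the abstract above measures the region of objective space dominated by a solution set and bounded by a reference point. The sketch below computes it for the two-objective minimization case only. It illustrates the indicator itself, not HMAMP's gradient-based training, and the function names are hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def pareto_front(points: List[Point]) -> List[Point]:
    """Non-dominated points of a two-objective minimization problem."""
    front: List[Point] = []
    best_f2 = float("inf")
    # Sweep in ascending f1; keep a point only if it strictly improves f2.
    for f1, f2 in sorted(set(points)):
        if f2 < best_f2:
            front.append((f1, f2))
            best_f2 = f2
    return front


def hypervolume_2d(points: List[Point], ref: Point) -> float:
    """Area dominated by `points` and bounded above by the reference point `ref`."""
    front = [p for p in pareto_front(points) if p[0] < ref[0] and p[1] < ref[1]]
    area = 0.0
    prev_f2 = ref[1]
    for f1, f2 in front:  # f1 ascending, so f2 is descending
        area += (ref[0] - f1) * (prev_f2 - f2)
        prev_f2 = f2
    return area
```

For the points (1, 3), (2, 2), (3, 1) with reference point (4, 4), the three slabs have areas 3, 2, and 1, so the hypervolume is 6. Adding a dominated point such as (3, 3) leaves the value unchanged.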
-
A Composite Decomposition Method for Large-Scale Global Optimization
Authors:
Maojiang Tian,
Minyang Chen,
Wei Du,
Yang Tang,
Yaochu Jin,
Gary G. Yen
Abstract:
Cooperative co-evolution (CC) algorithms, based on the divide-and-conquer strategy, have emerged as the predominant approach to solving large-scale global optimization (LSGO) problems. The efficiency and accuracy of the grouping stage significantly impact the performance of the optimization process. While the general separability grouping (GSG) method has overcome the limitation of previous differential grouping (DG) methods by enabling the decomposition of non-additively separable functions, it suffers from high computational complexity. To address this challenge, this article proposes a composite separability grouping (CSG) method, seamlessly integrating DG and GSG into a problem decomposition framework to utilize the strengths of both approaches. CSG introduces a step-by-step decomposition framework that accurately decomposes various problem types using fewer computational resources. By sequentially identifying additively, multiplicatively and generally separable variables, CSG progressively groups non-separable variables by recursively considering the interactions between each non-separable variable and the formed non-separable groups. Furthermore, to enhance the efficiency and accuracy of CSG, we introduce two innovative methods: a multiplicatively separable variable detection method and a non-separable variable grouping method. These two methods are designed to effectively detect multiplicatively separable variables and efficiently group non-separable variables, respectively. Extensive experimental results demonstrate that CSG achieves more accurate variable grouping with lower computational complexity compared to GSG and state-of-the-art DG series designs.
Submitted 8 March, 2024; v1 submitted 2 March, 2024;
originally announced March 2024.
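Note: the differential grouping (DG) methods referred to in the abstract above rest on a finite-difference test for additive separability: two variables are treated as interacting when the change in the objective caused by perturbing one of them depends on the value of the other. The sketch below shows that classic pairwise test only. It is not the CSG method, and the function name `interacts` and the `delta` and `eps` parameters are hypothetical.

```python
from typing import Callable, List, Sequence


def interacts(
    f: Callable[[Sequence[float]], float],
    x: Sequence[float],
    i: int,
    j: int,
    delta: float = 1.0,
    eps: float = 1e-9,
) -> bool:
    """Classic differential-grouping check for an interaction between x[i] and x[j]."""
    base: List[float] = list(x)

    xi = list(base)
    xi[i] += delta  # perturb variable i only

    xj = list(base)
    xj[j] += delta  # perturb variable j only

    xij = list(xj)
    xij[i] += delta  # perturb both

    d1 = f(xi) - f(base)  # effect of moving x[i] at the original x[j]
    d2 = f(xij) - f(xj)   # effect of moving x[i] after x[j] has moved
    return abs(d1 - d2) > eps
```

With `f(x) = x[0]**2 + x[1]*x[2]`, the test reports no interaction between variables 0 and 1 but does report one between variables 1 and 2. It also flags `f(x) = x[0]*x[1]` as interacting, which is the kind of non-additive case the abstract says plain DG cannot decompose.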
-
Diffusion Model-Based Multiobjective Optimization for Gasoline Blending Scheduling
Authors:
Wenxuan Fang,
Wei Du,
Renchu He,
Yang Tang,
Yaochu Jin,
Gary G. Yen
Abstract:
Gasoline blending scheduling uses resource allocation and operation sequencing to meet a refinery's production requirements. The presence of nonlinearity, integer constraints, and a large number of decision variables adds complexity to this problem, posing challenges for traditional and evolutionary algorithms. This paper introduces a novel multiobjective optimization approach driven by a diffusion model (named DMO), which is designed specifically for gasoline blending scheduling. To address integer constraints and generate feasible schedules, the diffusion model creates multiple intermediate distributions between Gaussian noise and the feasible domain. Through iterative processes, the solutions transition from Gaussian noise to feasible schedules while optimizing the objectives using the gradient descent method. DMO achieves simultaneous objective optimization and constraint adherence. Comparative tests are conducted to evaluate DMO's performance across various scales. The experimental results demonstrate that DMO surpasses state-of-the-art multiobjective evolutionary algorithms in terms of efficiency when solving gasoline blending scheduling problems.
Submitted 4 February, 2024;
originally announced February 2024.
-
Improving Performance Insensitivity of Large-scale Multiobjective Optimization via Monte Carlo Tree Search
Authors:
Haokai Hong,
Min Jiang,
Gary G. Yen
Abstract:
The large-scale multiobjective optimization problem (LSMOP) is characterized by simultaneously optimizing multiple conflicting objectives and involving hundreds of decision variables. Many real-world applications in engineering fields can be modeled as LSMOPs; at the same time, engineering applications require insensitivity in performance. This requirement usually means that the results should not only be good in every run, but also that performance should not fluctuate too much across multiple runs, i.e., the algorithm shows good insensitivity. Considering that substantial computational resources are required for each run, it is essential to improve both the performance and the insensitivity of large-scale multiobjective optimization algorithms. However, existing large-scale multiobjective optimization algorithms focus solely on improving performance, leaving the insensitivity characteristics unattended. In this work, we propose an evolutionary algorithm for solving LSMOPs based on Monte Carlo tree search, called LMMOCTS, which aims to improve both performance and insensitivity for large-scale multiobjective optimization problems. The proposed method samples the decision variables to construct new nodes on the Monte Carlo tree for optimization and evaluation. It selects nodes with good evaluation for further search to reduce the performance sensitivity caused by large-scale decision variables. We compare the proposed algorithm with several state-of-the-art designs on different benchmark functions. We also propose two metrics to measure the sensitivity of the algorithm. The experimental results confirm the effectiveness and performance insensitivity of the proposed design for solving large-scale multiobjective optimization problems.
Submitted 14 April, 2023; v1 submitted 8 April, 2023;
originally announced April 2023.
-
Efficient Evaluation Methods for Neural Architecture Search: A Survey
Authors:
Xiaotian Song,
Xiangning Xie,
Zeqiong Lv,
Gary G. Yen,
Weiping Ding,
Jiancheng Lv,
Yanan Sun
Abstract:
Neural Architecture Search (NAS) has received increasing attention because of its exceptional merits in automating the design of Deep Neural Network (DNN) architectures. However, the performance evaluation process, as a key part of NAS, often requires training a large number of DNNs. This inevitably makes NAS computationally expensive. In past years, many Efficient Evaluation Methods (EEMs) have been proposed to address this critical issue. In this paper, we comprehensively survey the EEMs published to date, and provide a detailed analysis to motivate the further development of this research direction. Specifically, we divide the existing EEMs into four categories based on the number of DNNs trained for constructing these EEMs. The categorization reflects the degree of efficiency in principle, which in turn helps readers quickly grasp the methodological features. In surveying each category, we further discuss the design principles and analyze the strengths and weaknesses to clarify the landscape of existing EEMs, making it easier to understand the research trends of EEMs. Furthermore, we discuss the current challenges and issues to identify future research directions in this emerging topic. In summary, this survey provides a convenient overview of EEMs for interested users, who can easily select the proper EEM for the task at hand. In addition, researchers in the NAS field can continue exploring the future directions suggested in the paper.
Submitted 8 October, 2024; v1 submitted 14 January, 2023;
originally announced January 2023.
-
Analyzing the Expected Hitting Time of Evolutionary Computation-based Neural Architecture Search Algorithms
Authors:
Zeqiong Lv,
Chao Qian,
Gary G. Yen,
Yanan Sun
Abstract:
Evolutionary computation-based neural architecture search (ENAS) is a popular technique for automating architecture design of deep neural networks. Despite its groundbreaking applications, there is no theoretical study for ENAS. The expected hitting time (EHT) is one of the most important theoretical issues, since it implies the average computational time complexity. This paper proposes a general method by integrating theory and experiment for estimating the EHT of ENAS algorithms, which includes common configuration, search space partition, transition probability estimation, population distribution fitting, and hitting time analysis. By exploiting the proposed method, we consider the ($λ$+$λ$)-ENAS algorithms with different mutation operators and estimate the lower bounds of the EHT. Furthermore, we study the EHT on the NAS-Bench-101 problem, and the results demonstrate the validity of the proposed method. To the best of our knowledge, this work is the first attempt to establish a theoretical foundation for ENAS algorithms.
Submitted 15 March, 2024; v1 submitted 11 October, 2022;
originally announced October 2022.
-
Learn to Adapt for Monocular Depth Estimation
Authors:
Qiyu Sun,
Gary G. Yen,
Yang Tang,
Chaoqiang Zhao
Abstract:
Monocular depth estimation is one of the fundamental tasks in environmental perception and has achieved tremendous progress by virtue of deep learning. However, the performance of trained models tends to degrade when they are applied to new datasets because of the gap between different datasets. Though some methods utilize domain adaptation technologies to jointly train on different domains and narrow the gap between them, the trained models cannot generalize to new domains that are not involved in training. To boost the transferability of depth estimation models, we propose an adversarial depth estimation task and train the model in a meta-learning pipeline. Our proposed adversarial task mitigates the issue of meta-overfitting, since the network is trained in an adversarial manner and aims to extract domain-invariant representations. In addition, we propose a cross-task depth consistency constraint that compels the depth estimates to be identical across different adversarial tasks, which improves the performance of our method and smooths the training process. Experiments demonstrate that our method adapts well to new datasets after a few training steps during the test procedure.
Submitted 26 March, 2022;
originally announced March 2022.
-
BenchENAS: A Benchmarking Platform for Evolutionary Neural Architecture Search
Authors:
Xiangning Xie,
Yuqiao Liu,
Yanan Sun,
Gary G. Yen,
Bing Xue,
Mengjie Zhang
Abstract:
Neural architecture search (NAS), which automatically designs the architectures of deep neural networks, has achieved breakthrough success over many applications in the past few years. Among different classes of NAS methods, evolutionary computation based NAS (ENAS) methods have recently gained much attention. Unfortunately, the issues of fair comparisons and efficient evaluations have hindered the development of ENAS. The current benchmark architecture datasets designed for fair comparisons only provide the datasets, not the ENAS algorithms or the platform to run the algorithms. The existing efficient evaluation methods are either not suitable for the population-based ENAS algorithm or are too complex to use. This paper develops a platform named BenchENAS to address these issues. BenchENAS aims to achieve fair comparisons by running different algorithms in the same environment and with the same settings. To achieve efficient evaluation in a common lab environment, BenchENAS designs a parallel component and a cache component with high maintainability. Furthermore, BenchENAS is easy to install and highly configurable and modular, which brings benefits in good usability and easy extensibility. The paper conducts efficient comparison experiments on eight ENAS algorithms with high GPU utilization on this platform. The experiments validate that the fair comparison issue does exist, and BenchENAS can alleviate this issue. A website has been built to promote BenchENAS at https://benchenas.com, where interested researchers can obtain the source code and document of BenchENAS for free.
Submitted 14 August, 2021; v1 submitted 9 August, 2021;
originally announced August 2021.
-
Snippet Policy Network for Multi-class Varied-length ECG Early Classification
Authors:
Yu Huang,
Gary G. Yen,
Vincent S. Tseng
Abstract:
Arrhythmia detection from ECG is an important research subject in the prevention and diagnosis of cardiovascular diseases. The prevailing studies formulate arrhythmia detection from ECG as a time series classification problem. Meanwhile, early detection of arrhythmia presents a real-world demand for early prevention and diagnosis. In this paper, we address the problem of early classification of cardiovascular disease, which is also a varied-length and long-length time series early classification problem. To solve this problem, we propose a deep reinforcement learning-based framework, namely Snippet Policy Network (SPN), consisting of four modules: snippet generator, backbone network, controlling agent, and discriminator. Compared to existing approaches, the proposed framework features flexible input length and addresses the dual optimization of the earliness and accuracy goals. Experimental results demonstrate that SPN achieves an accuracy of over 80%. Compared to the state-of-the-art methods, the proposed SPN delivers at least a 7% improvement on different metrics, including precision, recall, F1-score, and harmonic mean. To the best of our knowledge, this is the first work focusing on solving the cardiovascular early classification problem based on varied-length ECG data. With these features, SPN offers a good example of how to address varied-length time series early classification problems in general.
Submitted 28 July, 2021;
originally announced July 2021.
-
An Online Prediction Approach Based on Incremental Support Vector Machine for Dynamic Multiobjective Optimization
Authors:
Dejun Xu,
Min Jiang,
Weizhen Hu,
Shaozi Li,
Renhu Pan,
Gary G. Yen
Abstract:
Real-world multiobjective optimization problems usually involve conflicting objectives that change over time, which requires the optimization algorithms to quickly track the Pareto optimal front (POF) when the environment changes. In recent years, evolutionary algorithms based on prediction models have been considered promising. However, most existing approaches only make predictions based on the linear correlation between a finite number of optimal solutions in two or three previous environments. These incomplete information extraction strategies may lead to low prediction accuracy in some instances. In this paper, a novel prediction algorithm based on an incremental support vector machine (ISVM) is proposed, called ISVM-DMOEA. We treat the solving of dynamic multiobjective optimization problems (DMOPs) as an online learning process, using the continuously obtained optimal solutions to update an incremental support vector machine without discarding the solution information from earlier times. ISVM is then used to filter random solutions and generate an initial population for the next moment. To overcome the obstacle of insufficient training samples, a synthetic minority oversampling strategy is implemented before the training of ISVM. The advantage of this approach is that the nonlinear correlation between solutions can be explored online by ISVM, and the information contained in all historical optimal solutions can be exploited to a greater extent. The experimental results and comparison with chosen state-of-the-art algorithms demonstrate that the proposed algorithm can effectively tackle dynamic multiobjective optimization problems.
Submitted 24 February, 2021;
originally announced February 2021.
-
System Design and Analysis for Energy-Efficient Passive UAV Radar Imaging System using Illuminators of Opportunity
Authors:
Zhichao Sun,
Junjie Wu,
Gary G. Yen,
Hang Ren,
Hongyang An,
Jianyu Yang
Abstract:
Unmanned aerial vehicles (UAVs) can provide superior flexibility and cost-efficiency for modern radar imaging systems, making them an ideal platform for advanced remote sensing applications using synthetic aperture radar (SAR) technology. In this paper, an energy-efficient passive UAV radar imaging system using illuminators of opportunity is proposed and investigated for the first time. Equipped with a SAR receiver, the UAV platform passively reuses the backscattered signal of the target scene from an external illuminator, such as a SAR satellite, GNSS, or ground-based stationary commercial illuminators, and achieves bi-static SAR imaging and data communication. The system can provide instant access to the radar image of the targets of interest with enhanced platform concealment, which makes it an essential tool for stealth observation and scene monitoring. The mission concept and system block diagram are first presented, with justification of the advantages of the system. Then, a set of mission performance evaluators is established to quantitatively assess the capability of the system in a comprehensive manner, including UAV navigation, passive SAR imaging, and communication. Finally, the validity of the proposed performance evaluators is verified by numerical simulations.
Submitted 8 May, 2021; v1 submitted 30 September, 2020;
originally announced October 2020.
-
A Survey on Evolutionary Neural Architecture Search
Authors:
Yuqiao Liu,
Yanan Sun,
Bing Xue,
Mengjie Zhang,
Gary G. Yen,
Kay Chen Tan
Abstract:
Deep Neural Networks (DNNs) have achieved great success in many applications. The architectures of DNNs play a crucial role in their performance, which is usually manually designed with rich expertise. However, such a design process is labour intensive because of the trial-and-error process, and also not easy to realize due to the rare expertise in practice. Neural Architecture Search (NAS) is a type of technology that can design the architectures automatically. Among different methods to realize NAS, Evolutionary Computation (EC) methods have recently gained much attention and success. Unfortunately, there has not yet been a comprehensive summary of the EC-based NAS algorithms. This paper reviews over 200 papers of most recent EC-based NAS methods in light of the core components, to systematically discuss their design principles as well as justifications on the design. Furthermore, current challenges and issues are also discussed to identify future research in this emerging field.
Submitted 3 February, 2022; v1 submitted 25 August, 2020;
originally announced August 2020.
-
Masked GANs for Unsupervised Depth and Pose Prediction with Scale Consistency
Authors:
Chaoqiang Zhao,
Gary G. Yen,
Qiyu Sun,
Chongzhen Zhang,
Yang Tang
Abstract:
Previous work has shown that adversarial learning can be used for unsupervised monocular depth and visual odometry (VO) estimation, in which the adversarial loss and the geometric image reconstruction loss are utilized as the main supervisory signals to train the whole unsupervised framework. However, the performance of the adversarial framework and image reconstruction is usually limited by occlusions and the visual field changes between frames. This paper proposes a masked generative adversarial network (GAN) for unsupervised monocular depth and ego-motion estimation. The MaskNet and Boolean mask scheme are designed in this framework to eliminate the effects of occlusions and impacts of visual field changes on the reconstruction loss and adversarial loss, respectively. Furthermore, we also consider the scale consistency of our pose network by utilizing a new scale-consistency loss, and therefore, our pose network is capable of providing the full camera trajectory over a long monocular sequence. Extensive experiments on the KITTI dataset show that each component proposed in this paper contributes to the performance, and both our depth and trajectory predictions achieve competitive performance on the KITTI and Make3D datasets.
Submitted 13 April, 2021; v1 submitted 8 April, 2020;
originally announced April 2020.
-
When Autonomous Systems Meet Accuracy and Transferability through AI: A Survey
Authors:
Chongzhen Zhang,
Jianrui Wang,
Gary G. Yen,
Chaoqiang Zhao,
Qiyu Sun,
Yang Tang,
Feng Qian,
Jürgen Kurths
Abstract:
With widespread applications of artificial intelligence (AI), the capabilities of perception, understanding, decision-making, and control for autonomous systems have improved significantly in the past years. When autonomous systems are judged on accuracy and transferability, several AI methods, such as adversarial learning, reinforcement learning (RL), and meta-learning, show powerful performance. Here, we review the learning-based approaches in autonomous systems from the perspectives of accuracy and transferability. Accuracy means that a well-trained model shows good results during the testing phase, in which the testing set shares the same task or data distribution with the training set. Transferability means that when a well-trained model is transferred to other testing domains, the accuracy is still good. First, we introduce some basic concepts of transfer learning and then present some preliminaries of adversarial learning, RL, and meta-learning. Second, we review accuracy, transferability, or both to show the advantages of adversarial learning, such as generative adversarial networks (GANs), in typical computer vision tasks in autonomous systems, including image style transfer, image superresolution, image deblurring/dehazing/rain removal, semantic segmentation, depth estimation, pedestrian detection, and person re-identification (re-ID). Then, we further review the performance of RL and meta-learning in terms of accuracy, transferability, or both in autonomous systems, involving pedestrian tracking, robot navigation, and robotic manipulation. Finally, we discuss several challenges and future topics for using adversarial learning, RL, and meta-learning in autonomous systems.
Submitted 24 May, 2020; v1 submitted 29 March, 2020;
originally announced March 2020.
-
ArcText: A Unified Text Approach to Describing Convolutional Neural Network Architectures
Authors:
Yanan Sun,
Ziyao Ren,
Gary G. Yen,
Bing Xue,
Mengjie Zhang,
Jiancheng Lv
Abstract:
The superiority of Convolutional Neural Networks (CNNs) largely relies on their architectures, which are often manually crafted with extensive human expertise. Unfortunately, this kind of domain knowledge is not necessarily held by every interested user. Data mining on existing CNNs can discover useful patterns and fundamental sub-components from their architectures, providing researchers with strong prior knowledge to design proper CNN architectures when they have no expertise in CNNs. Various state-of-the-art data mining algorithms are available, yet little work has been done on mining CNN architectures. One of the main reasons is the gap between CNN architectures and data mining algorithms. Specifically, the current CNN architecture descriptions cannot be exactly vectorized as input to data mining algorithms. In this paper, we propose a unified approach, named ArcText, to describing CNN architectures based on text. In particular, four different units and an ordering method have been carefully designed in ArcText to uniquely describe the same architecture with sufficient information. Also, the resulting description can be exactly converted back to the corresponding CNN architecture. ArcText bridges the gap between CNN architectures and data mining researchers, and has the potential to be used in wider scenarios.
Submitted 29 May, 2020; v1 submitted 16 February, 2020;
originally announced February 2020.
-
Pruning Deep Convolutional Neural Networks Architectures with Evolution Strategy
Authors:
Francisco Erivaldo Fernandes Junior,
Gary G. Yen
Abstract:
Currently, Deep Convolutional Neural Networks (DCNNs) are used to solve all kinds of problems in the field of machine learning and artificial intelligence due to their learning and adaptation capabilities. However, most successful DCNN models have a high computational complexity, making them difficult to deploy on mobile or embedded platforms. This problem has prompted many researchers to develop algorithms and approaches to help reduce the computational complexity of such models. One of them is called filter pruning, where convolution filters are eliminated to reduce the number of parameters and, consequently, the computational complexity of the given model. In the present work, we propose a novel algorithm, called DeepPruningES, that performs filter pruning using a Multi-Objective Evolution Strategy (ES) algorithm. Our approach avoids the need for using any knowledge during the pruning procedure and helps decision-makers by returning three pruned CNN models with different trade-offs between performance and computational complexity. We show that DeepPruningES can significantly reduce a model's computational complexity by testing it on three DCNN architectures: Convolutional Neural Networks (CNNs), Residual Neural Networks (ResNets), and Densely Connected Neural Networks (DenseNets).
Submitted 30 November, 2020; v1 submitted 24 December, 2019;
originally announced December 2019.
-
Automatically Evolving CNN Architectures Based on Blocks
Authors:
Yanan Sun,
Bing Xue,
Mengjie Zhang,
Gary G. Yen
Abstract:
The performance of Convolutional Neural Networks (CNNs) relies heavily on their architectures. Designing a CNN with promising performance requires extensive expertise in both CNNs and the investigated problem, which is not necessarily held by every user interested in CNNs or the problem domain. In this paper, we propose to automatically evolve CNN architectures by using a genetic algorithm based on ResNet blocks and DenseNet blocks. The proposed algorithm is completely automatic in designing CNN architectures. In particular, neither pre-processing before it starts nor post-processing on the designed CNN is needed. Furthermore, the proposed algorithm does not require users to have domain knowledge of CNNs, the investigated problem or even genetic algorithms. The proposed algorithm is evaluated on CIFAR10 and CIFAR100 against 18 state-of-the-art peer competitors. Experimental results show that it outperforms hand-crafted state-of-the-art CNNs and CNNs designed by automatic peer competitors in terms of classification accuracy, and achieves competitive classification accuracy against semi-automatic peer competitors. In addition, the proposed algorithm consumes much less time than most peer competitors in finding the best CNN architectures.
Submitted 10 March, 2019; v1 submitted 28 October, 2018;
originally announced October 2018.
-
Automatically designing CNN architectures using genetic algorithm for image classification
Authors:
Yanan Sun,
Bing Xue,
Mengjie Zhang,
Gary G. Yen
Abstract:
Convolutional Neural Networks (CNNs) have achieved remarkable success on many image classification tasks in recent years. However, the performance of CNNs relies heavily on their architectures. The architectures of most state-of-the-art CNNs are manually designed with expertise in both CNNs and the investigated problems. Therefore, it is difficult for users who have no extensive expertise in CNNs to design optimal CNN architectures for their own image classification problems of interest. In this paper, we propose an automatic CNN architecture design method using genetic algorithms to effectively address image classification tasks. The main merit of the proposed algorithm lies in its "automatic" characteristic: users do not need domain knowledge of CNNs when using the proposed algorithm, yet they can still obtain a promising CNN architecture for the given images. The proposed algorithm is validated on widely used benchmark image classification datasets by comparison with state-of-the-art peer competitors covering eight manually designed CNNs, seven automatic + manually tuned CNNs and five automatic CNN architecture design algorithms. The experimental results indicate that the proposed algorithm outperforms the existing automatic CNN architecture design algorithms in terms of classification accuracy, number of parameters and consumed computational resources. The proposed algorithm also achieves classification accuracy comparable to the best of the manually designed and automatic + manually tuned CNNs, while consuming far fewer computational resources.
Submitted 26 March, 2020; v1 submitted 11 August, 2018;
originally announced August 2018.
-
IGD Indicator-based Evolutionary Algorithm for Many-objective Optimization Problems
Authors:
Yanan Sun,
Gary G. Yen,
Zhang Yi
Abstract:
Inverted Generational Distance (IGD) has been widely considered a reliable performance indicator to concurrently quantify the convergence and diversity of multi- and many-objective evolutionary algorithms. In this paper, an IGD indicator-based evolutionary algorithm for solving many-objective optimization problems (MaOPs) is proposed. Specifically, the IGD indicator is employed in each generation to select the solutions with favorable convergence and diversity. In addition, a computationally efficient dominance comparison method is designed to assign the rank values of solutions along with three newly proposed proximity distance assignments. Based on these two designs, the solutions are selected from a global view by a linear assignment mechanism that addresses convergence and diversity simultaneously. To improve the accuracy of the sampled reference points used to calculate the IGD indicator, we also propose an efficient decomposition-based nadir point estimation method for constructing the Utopian Pareto front, which is regarded as the best approximate Pareto front for real-world MaOPs at the early stage of the evolution. To evaluate the performance, a series of experiments is performed on the proposed algorithm against a group of selected state-of-the-art many-objective optimization algorithms over optimization problems with 8, 15, and 20 objectives. Experimental results measured by the chosen performance metrics indicate that the proposed algorithm is very competitive in addressing MaOPs.
Submitted 23 February, 2018;
originally announced February 2018.
-
Improved Regularity Model-based EDA for Many-objective Optimization
Authors:
Yanan Sun,
Gary G. Yen,
Zhang Yi
Abstract:
The performance of multi-objective evolutionary algorithms deteriorates appreciably in solving many-objective optimization problems, which encompass more than three objectives. One of the known reasons is the loss of selection pressure, which leads to the selected parents not generating promising offspring towards the Pareto-optimal front with diversity. Estimation of distribution algorithms sample new solutions with a probabilistic model built from statistics extracted from the existing solutions so as to mitigate the adverse impact of genetic operators. In this paper, an improved regularity-based estimation of distribution algorithm is proposed to effectively tackle unconstrained many-objective optimization problems. In the proposed algorithm, a diversity repairing mechanism is utilized to mend the areas that need non-dominated solutions with a closer proximity to the Pareto-optimal front. Then favorable solutions are generated by the model built from the regularity of the solutions surrounding a group of representatives. These two steps collectively enhance the selection pressure, which gives rise to the superior convergence of the proposed algorithm. In addition, a dimension reduction technique is employed in the decision space to speed up the estimation search of the proposed algorithm. Finally, by assigning the Pareto-optimal solutions to the uniformly distributed reference vectors, a set of solutions with excellent diversity and convergence is obtained. To measure the performance, NSGA-III, GrEA, MOEA/D, HypE, MBN-EDA, and RM-MEDA are selected to perform comparison experiments over the DTLZ and DTLZ$^-$ test suites with 3, 5, 8, 10, and 15 objectives. Experimental results quantified by the selected performance metrics reveal that the proposed algorithm shows considerable competitiveness in addressing unconstrained many-objective optimization problems.
Submitted 23 February, 2018;
originally announced February 2018.
-
Evolving Unsupervised Deep Neural Networks for Learning Meaningful Representations
Authors:
Yanan Sun,
Gary G. Yen,
Zhang Yi
Abstract:
Deep Learning (DL) aims at learning meaningful representations. A meaningful representation is one that gives rise to significant performance improvement of associated Machine Learning (ML) tasks when it replaces the raw data as the input. However, optimal architecture design and model parameter estimation in DL algorithms are widely considered to be intractable. Evolutionary algorithms are much preferable for complex and non-convex problems because they are gradient-free and insensitive to local optima. In this paper, we propose a computationally economical algorithm for evolving unsupervised deep neural networks to efficiently learn meaningful representations, which is very suitable in the current Big Data era where sufficient labeled data for training is often expensive to acquire. In the proposed algorithm, finding an appropriate architecture and the initialized parameter values for an ML task at hand is modeled by a computationally efficient gene encoding approach, which is employed to effectively model the task with a large number of parameters. In addition, a local search strategy is incorporated to facilitate the exploitation search for further improving the performance. Furthermore, a small proportion of labeled data is utilized during the evolutionary search to guarantee that the learnt representations are meaningful. The performance of the proposed algorithm has been thoroughly investigated over classification tasks. Specifically, the proposed algorithm consistently reaches a classification error rate of 1.15% on MNIST, which is a very promising result against state-of-the-art unsupervised DL algorithms.
Submitted 23 February, 2018; v1 submitted 13 December, 2017;
originally announced December 2017.
-
A Particle Swarm Optimization-based Flexible Convolutional Auto-Encoder for Image Classification
Authors:
Yanan Sun,
Bing Xue,
Mengjie Zhang,
Gary G. Yen
Abstract:
Convolutional auto-encoders have shown remarkable performance when stacked into deep convolutional neural networks for classifying image data during the past several years. However, they are unable to construct state-of-the-art convolutional neural networks due to their intrinsic architectures. In this regard, we propose a flexible convolutional auto-encoder by eliminating the constraints on the numbers of convolutional layers and pooling layers from the traditional convolutional auto-encoder. We also design an architecture discovery method using particle swarm optimization, which is capable of automatically searching for the optimal architectures of the proposed flexible convolutional auto-encoder with far fewer computational resources and without any manual intervention. We use the designed architecture optimization algorithm to test the proposed flexible convolutional auto-encoder on four extensively used image classification datasets, using one graphics processing unit card. Experimental results show that our work significantly outperforms the peer competitors, including the state-of-the-art algorithm.
Submitted 10 November, 2018; v1 submitted 13 December, 2017;
originally announced December 2017.
-
Evolving Deep Convolutional Neural Networks for Image Classification
Authors:
Yanan Sun,
Bing Xue,
Mengjie Zhang,
Gary G. Yen
Abstract:
Evolutionary computation methods have been successfully applied to neural networks for two decades, but those methods cannot scale well to modern deep neural networks due to the complicated architectures and large quantities of connection weights. In this paper, we propose a new method using genetic algorithms for evolving the architectures and connection weight initialization values of a deep convolutional neural network to address image classification problems. In the proposed algorithm, an efficient variable-length gene encoding strategy is designed to represent the different building blocks and the unpredictable optimal depth in convolutional neural networks. In addition, a new representation scheme is developed for effectively initializing connection weights of deep convolutional neural networks, which is expected to prevent networks from getting stuck in local minima, typically a major issue in backward gradient-based optimization. Furthermore, a novel fitness evaluation method is proposed to speed up the heuristic search with substantially fewer computational resources. The proposed algorithm is examined and compared with 22 existing algorithms, including the state-of-the-art methods, on nine widely used image classification tasks. The experimental results demonstrate the remarkable superiority of the proposed algorithm over the state-of-the-art algorithms in terms of classification error rate and the number of parameters (weights).
Submitted 10 March, 2019; v1 submitted 29 October, 2017;
originally announced October 2017.
-
Transfer Learning based Dynamic Multiobjective Optimization Algorithms
Authors:
Min Jiang,
Zhongqiang Huang,
Liming Qiu,
Wenzhen Huang,
Gary G. Yen
Abstract:
One of the major distinguishing features of dynamic multiobjective optimization problems (DMOPs) is that the optimization objectives change over time, so tracking the varying Pareto-optimal front becomes a challenge. One promising solution is reusing past "experiences" to construct a prediction model via statistical machine learning approaches. However, most of the existing methods ignore the non-independent and identically distributed nature of the data used to construct the prediction model. In this paper, we propose an algorithmic framework, called Tr-DMOEA, which integrates transfer learning and population-based evolutionary algorithms for solving DMOPs. This approach uses transfer learning as a tool to help reuse past experience to speed up the evolutionary process, and at the same time, any population-based multiobjective algorithm can benefit from this integration without extensive modifications. To verify this, we incorporate the proposed approach into three well-known algorithms, nondominated sorting genetic algorithm II (NSGA-II), multiobjective particle swarm optimization (MOPSO), and the regularity model-based multiobjective estimation of distribution algorithm (RM-MEDA), and then employ twelve benchmark functions to test these algorithms and compare them with some chosen state-of-the-art designs. The experimental results confirm the effectiveness of the proposed method through exploiting machine learning technology.
Submitted 18 November, 2017; v1 submitted 19 December, 2016;
originally announced December 2016.