-
Formal Methods Meets Readability: Auto-Documenting JML Java Code
Authors:
Juan Carlos Recio Abad,
Ruben Saborido,
Francisco Chicano
Abstract:
This paper investigates whether formal specifications using Java Modeling Language (JML) can enhance the quality of Large Language Model (LLM)-generated Javadocs. While LLMs excel at producing documentation from code alone, we hypothesize that incorporating formally verified invariants yields more complete and accurate results. We present a systematic comparison of documentation generated from JML-annotated and non-annotated Java classes, evaluating quality through both automated metrics and expert analysis. Our findings demonstrate that JML significantly improves class-level documentation completeness, with more moderate gains at the method level. Formal specifications prove particularly effective in capturing complex class invariants and design contracts that are frequently overlooked in code-only documentation. A threshold effect emerges, where the benefits of JML become more pronounced for classes with richer sets of invariants. While JML enhances specification coverage, its impact on core descriptive quality is limited, suggesting that formal specifications primarily ensure comprehensive coverage rather than fundamentally altering implementation descriptions. These results offer actionable insights for software teams adopting formal methods in documentation workflows, highlighting scenarios where JML provides clear advantages. The study contributes to AI-assisted software documentation research by demonstrating how formal methods and LLMs can synergistically improve documentation quality.
Submitted 10 June, 2025;
originally announced June 2025.
-
The Quantum Approximate Optimization Algorithm Can Require Exponential Time to Optimize Linear Functions
Authors:
Francisco Chicano,
Zakaria Abdelmoiz Dahi,
Gabriel Luque
Abstract:
QAOA is a hybrid quantum-classical algorithm to solve optimization problems in gate-based quantum computers. It is based on a variational quantum circuit that can be interpreted as a discretization of the annealing process that quantum annealers follow to find a minimum energy state of a given Hamiltonian. This ensures that QAOA must find an optimal solution for any given optimization problem when the number of layers, $p$, used in the variational quantum circuit tends to infinity. In practice, the number of layers is usually bounded by a small number. This is a must in current quantum computers of the NISQ era, due to the depth limit of the circuits they can run to avoid problems with decoherence and noise. In this paper, we show mathematical evidence that QAOA requires exponential time to solve linear functions when the number of layers is less than the number of different coefficients of the linear function $n$. We conjecture that QAOA needs exponential time to find the global optimum of linear functions for any constant value of $p$, and that the runtime is linear only if $p \geq n$. We conclude that we need new quantum algorithms to reach quantum supremacy in quantum optimization.
Submitted 26 May, 2025; v1 submitted 9 May, 2025;
originally announced May 2025.
-
Generate more than one child in your co-evolutionary semi-supervised learning GAN
Authors:
Francisco Sedeño,
Jamal Toutouh,
Francisco Chicano
Abstract:
Generative Adversarial Networks (GANs) are very useful methods to address semi-supervised learning (SSL) datasets, thanks to their ability to generate samples similar to real data. This approach, called SSL-GAN, has attracted many researchers in the last decade. Evolutionary algorithms have been used to guide the evolution and training of SSL-GANs with great success. In particular, several co-evolutionary approaches have been applied where the two networks of a GAN (the generator and the discriminator) are evolved in separate populations. The co-evolutionary approaches published to date assume some spatial structure of the populations, based on the ideas of cellular evolutionary algorithms. They also create one single individual per generation and follow a generational replacement strategy in the evolution. In this paper, we reconsider those algorithmic design decisions and propose a new co-evolutionary approach, called Co-evolutionary Elitist SSL-GAN (CE-SSLGAN), with panmictic population, elitist replacement, and more than one individual in the offspring. We evaluate the performance of our proposed method using three standard benchmark datasets. The results show that creating more than one offspring per population and using elitism improves the results in comparison with a classical SSL-GAN.
Submitted 29 April, 2025;
originally announced April 2025.
-
On Revealing the Hidden Problem Structure in Real-World and Theoretical Problems Using Walsh Coefficient Influence
Authors:
M. W. Przewozniczek,
F. Chicano,
R. Tinós,
J. Nalepa,
B. Ruszczak,
A. M. Wijata
Abstract:
Gray-box optimization employs Walsh decomposition to obtain non-linear variable dependencies and utilize them to propose masks of variables that have a joint non-linear influence on fitness value. These masks significantly improve the effectiveness of variation operators. In some problems, all variables are non-linearly dependent, making the aforementioned masks useless. We analyze the features of the real-world instances of such problems and show that many of their dependencies may have noise-like origins. Such noise-caused dependencies are irrelevant to the optimization process and can be ignored. To identify them, we propose extending the use of Walsh decomposition by measuring variable dependency strength that allows the construction of the weighted dynamic Variable Interaction Graph (wdVIG). wdVIGs adjust the dependency strength to mixed individuals. They allow the filtering of irrelevant dependencies and re-enable using dependency-based masks by variation operators. We verify the wdVIG potential on a large benchmark suite. For problems with noise, the wdVIG masks can improve the optimizer's effectiveness. If all dependencies are relevant for the optimization, i.e., the problem is not noised, the influence of wdVIG masks is similar to that of state-of-the-art structures of this kind.
Submitted 16 April, 2025;
originally announced April 2025.
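As a rough illustration of the Walsh decomposition mentioned in the abstract above, the following sketch computes all Walsh coefficients of a small pseudo-Boolean function by brute force. It is not the wdVIG construction from the paper; the function `toy` and the helper name `walsh_coefficients` are hypothetical and chosen only for this example. A non-zero coefficient whose mask covers two or more variables signals a non-linear dependency among them.

```python
from itertools import product


def walsh_coefficients(f, n):
    """Return {mask: coefficient} for a pseudo-Boolean f over n bits.

    Brute force over all 2^n points, so only suitable for tiny n.
    """
    points = list(product((0, 1), repeat=n))
    values = [f(x) for x in points]
    coeffs = {}
    for mask in points:
        total = 0.0
        for x, fx in zip(points, values):
            # Sign is (-1)^(number of positions where mask and x are both 1).
            parity = sum(m & b for m, b in zip(mask, x)) % 2
            total += -fx if parity else fx
        coeffs[mask] = total / len(points)
    return coeffs


def toy(x):
    # Hypothetical function: x0 and x1 interact non-linearly, x2 is linear.
    return 2 * x[0] * x[1] + x[2]


coeffs = walsh_coefficients(toy, 3)
nonzero = {m: c for m, c in coeffs.items() if c != 0}
print(nonzero)
```

In this toy case the only non-zero coefficient with more than one variable in its mask is the one for `(1, 1, 0)`, which matches the single non-linear term in `toy`.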
-
Moving between high-quality optima using multi-satisfiability characteristics in hard-to-solve Max3Sat instances
Authors:
J. Piatek,
M. W. Przewozniczek,
F. Chicano,
R. Tinós
Abstract:
Gray-box optimization proposes effective and efficient optimizers of general use. To this end, it leverages information about variable dependencies and the subfunction-based problem representation. These approaches have already been shown to be effective by enabling tunnelling between local optima even if these moves require the modification of many dependent variables. Tunnelling is useful in solving the maximum satisfiability problem (MaxSat), which can be reformulated as Max3Sat. Since many real-world problems can be reduced to solving MaxSat/Max3Sat instances, it is important to solve them effectively and efficiently. Therefore, we focus on Max3Sat instances for which tunnelling fails to introduce improving moves between locally optimal high-quality solutions and the region of globally optimal solutions. We analyze the features of such instances on the grounds of phase transitions. Based on these observations, we propose manipulating clause-satisfiability characteristics that allow connecting high-quality solutions distant in the solution space. We utilize multi-satisfiability characteristics in the optimizer built from typical gray-box mechanisms. The experimental study shows that the proposed optimizer can solve those Max3Sat instances that are out of the grasp of state-of-the-art gray-box optimizers. At the same time, it remains effective for instances that have already been successfully solved by gray-box optimizers.
Submitted 16 April, 2025;
originally announced April 2025.
-
Combinatorial Optimization with Quantum Computers
Authors:
Francisco Chicano,
Gabriel Luque,
Zakaria Abdelmoiz Dahi,
Rodrigo Gil-Merino
Abstract:
Quantum computers leverage the principles of quantum mechanics to do computation with a potential advantage over classical computers. While a single classical computer transforms one particular binary input into an output after applying one operator to the input, a quantum computer can apply the operator to a superposition of binary strings to provide a superposition of binary outputs, doing computation apparently in parallel. This feature allows quantum computers to speed up the computation compared to classical algorithms. Unsurprisingly, quantum algorithms have been proposed to solve optimization problems in quantum computers. Furthermore, a family of quantum machines called quantum annealers are specially designed to solve optimization problems. In this paper, we provide an introduction to quantum optimization from a practical point of view. We introduce the reader to the use of quantum annealers and quantum gate-based machines to solve optimization problems.
Submitted 14 March, 2025; v1 submitted 20 December, 2024;
originally announced December 2024.
-
Iterated Local Search with Linkage Learning
Authors:
Renato Tinós,
Michal W. Przewozniczek,
Darrell Whitley,
Francisco Chicano
Abstract:
In pseudo-Boolean optimization, a variable interaction graph represents variables as vertices, and interactions between pairs of variables as edges. In black-box optimization, the variable interaction graph may be at least partially discovered by using empirical linkage learning techniques. These methods never report false variable interactions, but they are computationally expensive. The recently proposed local search with linkage learning discovers the partial variable interaction graph as a side-effect of iterated local search. However, information about the strength of the interactions is not learned by the algorithm. We propose local search with linkage learning 2, which builds a weighted variable interaction graph that stores information about the strength of the interaction between variables. The weighted variable interaction graph can provide new insights about the optimization problem and behavior of optimizers. Experiments with NK landscapes, knapsack problem, and feature selection show that local search with linkage learning 2 is able to efficiently build weighted variable interaction graphs. In particular, experiments with feature selection show that the weighted variable interaction graphs can be used for visualizing the feature interactions in machine learning. Additionally, new transformation operators that exploit the interactions between variables can be designed. We illustrate this ability by proposing a new perturbation operator for iterated local search.
Submitted 2 October, 2024;
originally announced October 2024.
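The claim that empirical linkage learning never reports false interactions can be illustrated with the standard pairwise check: if flipping bits i and j together changes the fitness by a different amount than the sum of the two single flips, the variables must interact. The sketch below is a generic version of that check, not the algorithm from the paper; `toy`, `flip`, and `interaction_delta` are hypothetical names, and using the absolute delta as an edge weight is an assumption made only for illustration.

```python
def flip(bits, *indices):
    """Return a copy of bits with the given positions flipped."""
    out = list(bits)
    for i in indices:
        out[i] ^= 1
    return out


def interaction_delta(f, x, i, j):
    """Non-additivity of flipping i and j at point x.

    A non-zero value proves that i and j interact; a zero value is
    inconclusive, because the interaction may show up at other points.
    """
    return f(flip(x, i, j)) - f(flip(x, i)) - f(flip(x, j)) + f(x)


def toy(x):
    # Hypothetical function: x0 and x1 interact, x2 is independent.
    return 2 * x[0] * x[1] + x[2]


x = [0, 0, 0]
edges = {}
for i in range(3):
    for j in range(i + 1, 3):
        d = interaction_delta(toy, x, i, j)
        if d != 0:
            # Illustrative weight only; not the weighting used in the paper.
            edges[(i, j)] = abs(d)
print(edges)
```

Each check costs four fitness evaluations, which hints at why exhaustive empirical linkage learning is computationally expensive.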
-
Generalizing and Unifying Gray-box Combinatorial Optimization Operators
Authors:
Francisco Chicano,
Darrell Whitley,
Gabriela Ochoa,
Renato Tinós
Abstract:
Gray-box optimization leverages the information available about the mathematical structure of an optimization problem to design efficient search operators. Efficient hill climbers and crossover operators have been proposed in the domain of pseudo-Boolean optimization and also in some permutation problems. However, there is no general rule on how to design these efficient operators in different representation domains. This paper proposes a general framework that encompasses all known gray-box operators for combinatorial optimization problems. The framework is general enough to shed light on the design of new efficient operators for new problems and representation domains. We also unify the proofs of efficiency for gray-box hill climbers and crossovers and show that the mathematical property explaining the speed-up of gray-box crossover operators also explains the efficient identification of improving moves in gray-box hill climbers. We illustrate the power of the new framework by proposing an efficient hill climber and crossover for two related permutation problems: the Linear Ordering Problem and the Single Machine Total Weighted Tardiness Problem.
Submitted 9 July, 2024;
originally announced July 2024.
-
Effective anytime algorithm for multiobjective combinatorial optimization problems
Authors:
Miguel Ángel Domínguez-Ríos,
Francisco Chicano,
Enrique Alba
Abstract:
In multiobjective optimization, the result of an optimization algorithm is a set of efficient solutions from which the decision maker selects one. It is common that not all the efficient solutions can be computed in a short time and the search algorithm has to be stopped prematurely to analyze the solutions found so far. A set of efficient solutions that are well-spread in the objective space is preferred to provide the decision maker with a great variety of solutions. However, just a few exact algorithms in the literature exist with the ability to provide such a well-spread set of solutions at any moment: we call them anytime algorithms. We propose a new exact anytime algorithm for multiobjective combinatorial optimization combining three novel ideas to enhance the anytime behavior. We compare the proposed algorithm with those in the state-of-the-art for anytime multiobjective combinatorial optimization using a set of 480 instances from different well-known benchmarks and four different performance measures: the overall non-dominated vector generation ratio, the hypervolume, the general spread and the additive epsilon indicator. A comprehensive experimental study reveals that our proposal outperforms the previous algorithms in most of the instances.
Submitted 6 February, 2024;
originally announced March 2024.
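The notion of an efficient (non-dominated) solution used in the abstract above can be sketched as a simple filter over objective vectors. This is only the textbook definition, assuming every objective is minimized; it is not the anytime algorithm proposed in the paper, and the sample points are made up.

```python
def dominates(a, b):
    """True if a is no worse than b in every objective and better in one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def efficient_set(points):
    """Keep the points that no other point dominates (minimization)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical bi-objective values, for illustration only.
points = [(1, 5), (2, 2), (3, 3), (5, 1)]
front = efficient_set(points)
print(front)
```

Here `(3, 3)` is dropped because `(2, 2)` is better in both objectives, while the remaining three points are mutually incomparable.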
-
Automatizing Software Cognitive Complexity Reduction through Integer Linear Programming
Authors:
Rubén Saborido,
Javier Ferrer,
Francisco Chicano
Abstract:
Reducing the cognitive complexity of a piece of code to a given threshold is not trivial. Recently, we modeled software cognitive complexity reduction as an optimization problem and we proposed an approach to assist developers on this task. This approach enumerates sequences of code extraction refactoring operations until a stopping criterion is met. As a result, it returns the minimal sequence of code extraction refactoring operations that is able to reduce the cognitive complexity of a code to the given threshold. However, exhaustive enumeration algorithms fail to scale with the code size. The number of refactoring plans can grow exponentially with the number of lines of code. In this paper, instead of enumerating sequences of code extraction refactoring operations, we model the cognitive complexity reduction as an Integer Linear Programming problem. This opens the door to the use of efficient solvers to find optimal solutions in large programs.
Submitted 8 February, 2024;
originally announced February 2024.
-
CMSA algorithm for solving the prioritized pairwise test data generation problem in software product lines
Authors:
Javier Ferrer,
Francisco Chicano,
José Antonio Ortega Toro
Abstract:
In Software Product Lines (SPLs) it may be difficult or even impossible to test all the products of the family because of the large number of valid feature combinations that may exist. Thus, we want to find a minimal subset of the product family that allows us to test all these possible combinations (pairwise). Furthermore, when testing a single product is a great effort, it is desirable to first test products composed of a set of priority features. This problem is called Prioritized Pairwise Test Data Generation Problem.
State-of-the-art algorithms based on Integer Linear Programming for this problem are fast enough for small and medium instances. However, there exist some real instances that are too large to be computed with these algorithms in a reasonable time because of the exponential growth of the number of candidate solutions. Also, these heuristics do not always lead us to the best solutions. In this work we propose a new approach based on a hybrid metaheuristic algorithm called Construct, Merge, Solve & Adapt (CMSA). We compare this matheuristic with four algorithms: a Hybrid algorithm based on Integer Linear Programming (HILP), a Hybrid algorithm based on Integer Nonlinear Programming (HINLP), the Parallel Prioritized Genetic Solver (PPGS), and a greedy algorithm called prioritized-ICPL. The analysis reveals that CMSA results in statistically significantly better quality solutions in most instances and for most levels of weighted coverage, although it requires more execution time.
Submitted 7 February, 2024;
originally announced February 2024.
-
Efficient anytime algorithms to solve the bi-objective Next Release Problem
Authors:
Miguel Ángel Domínguez-Ríos,
Francisco Chicano,
Enrique Alba,
Isabel María del Águila,
José del Sagrado
Abstract:
The Next Release Problem consists in selecting a subset of requirements to develop in the next release of a software product. The selection should be done in a way that maximizes the satisfaction of the stakeholders while the development cost is minimized and the constraints of the requirements are fulfilled. Recent works have solved the problem using exact methods based on Integer Linear Programming. In practice, there is no need to compute all the efficient solutions of the problem; a well-spread set in the objective space is more convenient for the decision maker. The exact methods used in the past to find the complete Pareto front explore the objective space in a lexicographic order or use a weighted sum of the objectives to solve a single-objective problem, finding only supported solutions. In this work, we propose five new methods that maintain a well-spread set of solutions at any time during the search, so that the decision maker can stop the algorithm when a large enough set of solutions is found. The methods are called anytime due to this feature. They find both supported and non-supported solutions, and can complete the whole Pareto front if the time provided is long enough.
Submitted 7 February, 2024;
originally announced February 2024.
-
Using metaheuristics for the location of bicycle stations
Authors:
Christian Cintrano,
Francisco Chicano,
Enrique Alba
Abstract:
In this work, we solve the problem of finding the best locations to place stations for depositing/collecting shared bicycles. To do this, we model the problem as the p-median problem, which is a major existing localization problem in optimization. The p-median problem seeks to place a set of facilities (bicycle stations) in a way that minimizes the distance between a set of clients (citizens) and their closest facility (bike station). We have used a genetic algorithm, iterated local search, particle swarm optimization, simulated annealing, and variable neighbourhood search to find the best locations for the bicycle stations and study their comparative advantages. We use irace to parameterize each algorithm automatically, contributing a methodology for fine-tuning algorithms automatically. We have also studied different real data (distance and weights) from diverse open data sources from a real city, Malaga (Spain), hopefully leading to a final smart city application. We have compared our results with the implemented solution in Malaga. Finally, we have analyzed how we can use our proposal to improve the existing system in the city by adding more stations.
Submitted 6 February, 2024;
originally announced February 2024.
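The p-median objective described in the abstract above can be written in a few lines. The sketch below evaluates the weighted distance from each client to its closest open station and, for a tiny instance, finds the best set of p stations by exhaustive enumeration. The distance matrix and weights are invented for illustration, and the enumeration stands in for the metaheuristics used in the paper, which are needed once instances grow beyond toy size.

```python
from itertools import combinations


def p_median_cost(dist, weights, stations):
    """Sum over clients of weight times distance to the closest open station.

    dist[c][s] is the distance from client c to candidate site s.
    """
    return sum(
        w * min(dist[c][s] for s in stations)
        for c, w in enumerate(weights)
    )


def best_stations(dist, weights, p):
    """Exhaustively pick the p sites with minimum cost (tiny instances only)."""
    sites = range(len(dist[0]))
    return min(
        combinations(sites, p),
        key=lambda chosen: p_median_cost(dist, weights, chosen),
    )


# Hypothetical data: 3 clients, 3 candidate sites.
dist = [
    [0, 4, 7],
    [4, 0, 3],
    [7, 3, 0],
]
weights = [1, 2, 1]

chosen = best_stations(dist, weights, 2)
print(chosen, p_median_cost(dist, weights, chosen))
```

The number of candidate subsets grows combinatorially with the number of sites, which is one reason heuristic search is the usual choice for city-scale instances.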
-
Dynastic Potential Crossover Operator
Authors:
Francisco Chicano,
Gabriela Ochoa,
Darrell Whitley,
Renato Tinós
Abstract:
An optimal recombination operator for two parent solutions provides the best solution among those that take the value for each variable from one of the parents (gene transmission property). If the solutions are bit strings, the offspring of an optimal recombination operator is optimal in the smallest hyperplane containing the two parent solutions. Exploring this hyperplane is computationally costly, in general, requiring exponential time in the worst case. However, when the variable interaction graph of the objective function is sparse, exploration can be done in polynomial time.
In this paper, we present a recombination operator, called Dynastic Potential Crossover (DPX), that runs in polynomial time and behaves like an optimal recombination operator for low-epistasis combinatorial problems. We compare this operator, both theoretically and experimentally, with traditional crossover operators, like uniform crossover and network crossover, and with two recently defined efficient recombination operators: partition crossover and articulation points partition crossover. The empirical comparison uses NKQ Landscapes and MAX-SAT instances. DPX outperforms the other crossover operators in terms of quality of the offspring and provides better results when included in a trajectory-based and a population-based metaheuristic, but it requires more time and memory to compute the offspring.
Submitted 6 February, 2024;
originally announced February 2024.
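To make the gene transmission property and the exponential worst case concrete, the following sketch implements optimal recombination by brute force: it enumerates every child that takes each differing bit from one of the two parents and returns the best one. This is the naive baseline, not DPX; maximization is assumed, and the objective `toy` and the parent strings are hypothetical.

```python
from itertools import product


def optimal_recombination(f, p1, p2):
    """Best child in the smallest hyperplane containing p1 and p2.

    Enumerates 2^d children, where d is the number of differing bits,
    so the running time is exponential in d.
    """
    diff = [i for i, (a, b) in enumerate(zip(p1, p2)) if a != b]
    best, best_val = None, None
    for choice in product((0, 1), repeat=len(diff)):
        child = list(p1)  # bits shared by both parents are inherited as-is
        for i, take_p2 in zip(diff, choice):
            if take_p2:
                child[i] = p2[i]
        val = f(child)
        if best_val is None or val > best_val:
            best, best_val = child, val
    return best, best_val


def toy(x):
    # Hypothetical objective to maximize.
    return 2 * x[0] * x[1] + x[2] - x[3]


child, value = optimal_recombination(toy, [1, 0, 0, 1], [0, 1, 1, 1])
print(child, value)
```

The parents differ in three positions, so eight children are evaluated; the last bit is shared by both parents and is therefore never changed.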
-
NK Hybrid Genetic Algorithm for Clustering
Authors:
Renato Tinós,
Liang Zhao,
Francisco Chicano,
Darrell Whitley
Abstract:
The NK hybrid genetic algorithm for clustering is proposed in this paper. In order to evaluate the solutions, the hybrid algorithm uses the NK clustering validation criterion 2 (NKCV2). NKCV2 uses information about the disposition of $N$ small groups of objects. Each group is composed of $K+1$ objects of the dataset. Experimental results show that density-based regions can be identified by using NKCV2 with fixed small $K$. In NKCV2, the relationship between decision variables is known, which in turn allows us to apply gray-box optimization. Mutation operators, a partition crossover, and a local search strategy are proposed, all using information about the relationship between decision variables. In partition crossover, the evaluation function is decomposed into $q$ independent components; partition crossover then deterministically returns the best among $2^q$ possible offspring with computational complexity $O(N)$. The NK hybrid genetic algorithm allows the detection of clusters with arbitrary shapes and the automatic estimation of the number of clusters. In the experiments, the NK hybrid genetic algorithm produced very good results when compared to another genetic algorithm approach and to state-of-the-art clustering algorithms.
Submitted 6 February, 2024;
originally announced February 2024.
-
Optimising Communication Overhead in Federated Learning Using NSGA-II
Authors:
José Ángel Morell,
Zakaria Abdelmoiz Dahi,
Francisco Chicano,
Gabriel Luque,
Enrique Alba
Abstract:
Federated learning is a training paradigm according to which a server-based model is cooperatively trained using local models running on edge devices and ensuring data privacy. These devices exchange information that induces a substantial communication load, which jeopardises the functioning efficiency. The difficulty of reducing this overhead lies in achieving this without decreasing the model's efficiency (contradictory relation). To do so, many works investigated the compression of the pre/mid/post-trained models and the communication rounds, separately, although they jointly contribute to the communication overload. Our work aims at optimising communication overhead in federated learning by (I) modelling it as a multi-objective problem and (II) applying a multi-objective optimization algorithm (NSGA-II) to solve it. To the best of the authors' knowledge, this is the first work that (I) explores the add-in that evolutionary computation could bring for solving such a problem, and (II) considers both the neuron and devices features together. We perform the experimentation by simulating a server/client architecture with 4 slaves. We investigate both convolutional and fully-connected neural networks with 12 and 3 layers, 887,530 and 33,400 weights, respectively. We conducted the validation on the MNIST dataset containing 70,000 images. The experiments have shown that our proposal could reduce communication by 99% and maintain an accuracy equal to the one obtained by the FedAvg Algorithm that uses 100% of communications.
Submitted 1 April, 2022;
originally announced April 2022.
-
The Asteroid Routing Problem: A Benchmark for Expensive Black-Box Permutation Optimization
Authors:
Manuel López-Ibáñez,
Francisco Chicano,
Rodrigo Gil-Merino
Abstract:
Inspired by the recent 11th Global Trajectory Optimisation Competition, this paper presents the asteroid routing problem (ARP) as a realistic benchmark of algorithms for expensive bound-constrained black-box optimization in permutation space. Given a set of asteroids' orbits and a departure epoch, the goal of the ARP is to find the optimal sequence for visiting the asteroids, starting from Earth's orbit, in order to minimize both the cost, measured as the sum of the magnitude of velocity changes required to complete the trip, and the time, measured as the time elapsed from the departure epoch until visiting the last asteroid. We provide open-source code for generating instances of arbitrary sizes and evaluating solutions to the problem. As a preliminary analysis, we compare the results of two methods for expensive black-box optimization in permutation spaces, namely, Combinatorial Efficient Global Optimization (CEGO), a Bayesian optimizer based on Gaussian processes, and Unbalanced Mallows Model (UMM), an estimation-of-distribution algorithm based on probabilistic Mallows models. We investigate the best permutation representation for each algorithm, either rank-based or order-based. Moreover, we analyze the effect of providing a good initial solution, generated by a greedy nearest neighbor heuristic, on the performance of the algorithms. The results suggest directions for improvements in the algorithms being compared.
Submitted 19 April, 2022; v1 submitted 29 March, 2022;
originally announced March 2022.
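The greedy nearest neighbor heuristic and the two permutation representations mentioned in the abstract above can be illustrated with a minimal sketch. It rests on loud assumptions: plain Euclidean distance stands in for the expensive trajectory cost, the function names `greedy_nearest_neighbor` and `to_ranks` are hypothetical, and "rank-based" is read as the inverse of the visiting order (one common convention, not confirmed by the abstract). It is not the authors' open-source code.

```python
import math

def greedy_nearest_neighbor(start, points):
    """Build a visiting order by always moving to the closest unvisited point.

    ASSUMPTION: Euclidean distance replaces the real (expensive) trajectory cost.
    Ties are broken by the smaller index so the result is deterministic.
    """
    unvisited = set(range(len(points)))
    current = start
    order = []
    while unvisited:
        nxt = min(unvisited, key=lambda i: (math.dist(current, points[i]), i))
        order.append(nxt)
        current = points[nxt]
        unvisited.remove(nxt)
    return order

def to_ranks(order):
    """Convert an order-based permutation (order[t] = item visited at step t)
    into a rank-based one (ranks[item] = step at which the item is visited).

    ASSUMPTION: this inverse-permutation reading of "rank-based" is illustrative.
    """
    ranks = [0] * len(order)
    for position, item in enumerate(order):
        ranks[item] = position
    return ranks

# Toy instance: three targets on a line, departure point at the origin.
targets = [(5.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
order = greedy_nearest_neighbor((0.0, 0.0), targets)
ranks = to_ranks(order)
```

A solution built this way could serve as the initial point handed to an optimizer, which is the role the abstract describes for the greedy heuristic.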
-
Anti-patterns and the energy efficiency of Android applications
Authors:
Rodrigo Morales,
Ruben Saborido,
Foutse Khomh,
Francisco Chicano,
Giuliano Antoniol
Abstract:
The boom in mobile apps has changed the traditional landscape of software development by introducing new challenges due to the limited resources of mobile devices, e.g., memory, CPU, network bandwidth and battery. The energy consumption of mobile apps is nowadays a hot topic and researchers are actively investigating the role of coding practices on energy efficiency. Recent studies suggest that design quality can conflict with energy efficiency. Therefore, it is important to take into account energy efficiency when evolving the design of a mobile app. The research community has proposed approaches to detect and remove anti-patterns (i.e., poor solutions to design and implementation problems) in software systems but, to the best of our knowledge, none of these approaches has included anti-patterns that are specific to mobile apps and/or considered the energy efficiency of apps. In this paper, we fill this gap in the literature by analyzing the impact of eight types of anti-patterns on a testbed of 59 Android apps extracted from F-Droid. First, we (1) analyze the impact of anti-patterns in mobile apps with respect to energy efficiency; then (2) we study the impact of different types of anti-patterns on energy efficiency. We found that the energy consumption of apps containing anti-patterns is statistically different from that of apps without them (refactored apps). Moreover, we find that the impact of refactoring anti-patterns can be positive (7 types of anti-patterns) or negative (2 types of anti-patterns). Therefore, developers should consider the impact on energy efficiency of refactoring when applying maintenance activities.
Submitted 19 October, 2016; v1 submitted 18 October, 2016;
originally announced October 2016.
-
Efficient Hill-Climber for Multi-Objective Pseudo-Boolean Optimization
Authors:
Francisco Chicano,
Darrell Whitley,
Renato Tinos
Abstract:
Local search algorithms and iterated local search algorithms are basic techniques. Local search can be a stand-alone search method, but it can also be hybridized with evolutionary algorithms. Recently, it has been shown that it is possible to identify improving moves in Hamming neighborhoods for k-bounded pseudo-Boolean optimization problems in constant time. This means that local search does not need to enumerate neighborhoods to find improving moves. It also means that evolutionary algorithms do not need to use random mutation as an operator, except perhaps as a way to escape local optima. In this paper, we show how improving moves can be identified in constant time for multiobjective problems that are expressed as k-bounded pseudo-Boolean functions. In particular, multiobjective forms of NK Landscapes and Mk Landscapes are considered.
Submitted 27 January, 2016;
originally announced January 2016.
-
Optimal Neuron Selection: NK Echo State Networks for Reinforcement Learning
Authors:
Darrell Whitley,
Renato Tinós,
Francisco Chicano
Abstract:
This paper introduces the NK Echo State Network. The problem of learning in the NK Echo State Network is reduced to the problem of optimizing a special form of a Spin Glass Problem known as an NK Landscape. No weight adjustment is used; all learning is accomplished by spinning up (turning on) or spinning down (turning off) neurons in order to find a combination of neurons that work together to achieve the desired computation. For special types of NK Landscapes, an exact global solution can be obtained in polynomial time using dynamic programming. The NK Echo State Network is applied to a reinforcement learning problem requiring a recurrent network: balancing two poles on a cart given no velocity information. Empirical results show that the NK Echo State Network learns very rapidly and yields very good generalization.
Submitted 7 May, 2015;
originally announced May 2015.
-
A Hitchhiker's Guide to Search-Based Software Engineering for Software Product Lines
Authors:
Roberto E. Lopez-Herrejon,
Javier Ferrer,
Francisco Chicano,
Lukas Linsbauer,
Alexander Egyed,
Enrique Alba
Abstract:
Search-Based Software Engineering (SBSE) is an emerging discipline that focuses on the application of search-based optimization techniques to software engineering problems. The capacity of SBSE techniques to tackle problems involving large search spaces makes their application attractive for Software Product Lines (SPLs). In recent years, several publications have appeared that apply SBSE techniques to SPL problems. In this paper, we present the results of a systematic mapping study of such publications. We identified the stages of the SPL life cycle where SBSE techniques have been used, what case studies have been employed and how they have been analysed. This mapping study revealed potential avenues for further research as well as common misunderstandings and pitfalls when applying SBSE techniques, which we address by providing a guideline for researchers and practitioners interested in exploiting these techniques.
Submitted 11 June, 2014;
originally announced June 2014.
-
Towards a Benchmark and a Comparison Framework for Combinatorial Interaction Testing of Software Product Lines
Authors:
Roberto E. Lopez-Herrejon,
Javier Ferrer,
Francisco Chicano,
Evelyn Nicole Haslinger,
Alexander Egyed,
Enrique Alba
Abstract:
As Software Product Lines (SPLs) are becoming a more pervasive development practice, their effective testing is becoming a more important concern. In the past few years many SPL testing approaches have been proposed; among them are those that support Combinatorial Interaction Testing (CIT), whose premise is to select a group of products where faults due to feature interactions are more likely to occur. Many CIT techniques for SPL testing have been put forward; however, no systematic and comprehensive comparison among them has been performed. To achieve such a goal, two items are important: a common benchmark of feature models, and an adequate comparison framework. In this research-in-progress paper, we propose 19 feature models as the base of a benchmark, which we apply to three different techniques in order to analyze the comparison framework proposed by Perrouin et al. We identify the shortcomings of this framework and elaborate alternatives for further study.
Submitted 21 January, 2014;
originally announced January 2014.
-
Fitness Probability Distribution of Bit-Flip Mutation
Authors:
Francisco Chicano,
Andrew M. Sutton,
L. Darrell Whitley,
Enrique Alba
Abstract:
Bit-flip mutation is a common mutation operator for evolutionary algorithms applied to optimize functions over binary strings. In this paper, we develop results from the theory of landscapes and Krawtchouk polynomials to exactly compute the probability distribution of fitness values of a binary string undergoing uniform bit-flip mutation. We prove that this probability distribution can be expressed as a polynomial in p, the probability of flipping each bit. We analyze these polynomials and provide closed-form expressions for an easy linear problem (Onemax), and an NP-hard problem, MAX-SAT. We also discuss some implications of the results for runtime analysis.
Submitted 11 September, 2013;
originally announced September 2013.
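For the Onemax case named in the abstract above, the fitness distribution after uniform bit-flip mutation can be computed directly, which makes the "polynomial in p" claim concrete. The sketch below is an elementary derivation, not the paper's Krawtchouk-polynomial method: if the parent has k ones out of n bits, the number of ones lost is Binomial(k, p) and the number gained is Binomial(n - k, p), so the child's fitness distribution is a convolution of the two. The function names are hypothetical.

```python
from math import comb

def binom_pmf(n, p):
    """Return [P(X = j) for j in 0..n] for X ~ Binomial(n, p)."""
    return [comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(n + 1)]

def onemax_mutation_distribution(n, k, p):
    """Exact distribution of the Onemax fitness of a mutated string.

    n: string length, k: number of ones in the parent,
    p: probability of flipping each bit independently.
    Returns dist, where dist[f] is the probability that the child has f ones.
    Every entry is a sum of products of powers of p and (1 - p),
    hence a polynomial in p.
    """
    lost = binom_pmf(k, p)        # ones flipped to zero
    gained = binom_pmf(n - k, p)  # zeros flipped to one
    dist = [0.0] * (n + 1)
    for a, prob_a in enumerate(lost):
        for b, prob_b in enumerate(gained):
            dist[k - a + b] += prob_a * prob_b
    return dist

# Illustrative parameters (not taken from the paper).
dist = onemax_mutation_distribution(n=10, k=7, p=0.1)
expected_fitness = sum(f * prob for f, prob in enumerate(dist))
```

The expected child fitness agrees with the linear formula k(1 - p) + (n - k)p, which is a quick sanity check on the convolution.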
-
Local Optima Networks, Landscape Autocorrelation and Heuristic Search Performance
Authors:
Francisco Chicano,
Fabio Daolio,
Gabriela Ochoa,
Sébastien Verel,
Marco Tomassini,
Enrique Alba
Abstract:
Recent developments in fitness landscape analysis include the study of Local Optima Networks (LON) and applications of the Elementary Landscapes theory. This paper represents a first step towards combining these two tools to explore their ability to forecast the performance of search algorithms. We base our analysis on the Quadratic Assignment Problem (QAP) and conduct a large statistical study over 600 generated instances of different types. Our results reveal interesting links between the network measures, the autocorrelation measures and the performance of heuristic search algorithms.
Submitted 15 October, 2012;
originally announced October 2012.
-
Elementary Components of the Quadratic Assignment Problem
Authors:
Francisco Chicano,
Gabriel Luque,
Enrique Alba
Abstract:
The Quadratic Assignment Problem (QAP) is a well-known NP-hard combinatorial optimization problem that is at the core of many real-world optimization problems. We prove that QAP can be written as the sum of three elementary landscapes when the swap neighborhood is used. We present a closed formula for each of the three elementary components and we compute bounds for the autocorrelation coefficient.
Submitted 22 September, 2011;
originally announced September 2011.
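As a reminder of the objects the abstract above refers to, the following minimal sketch evaluates the standard QAP objective and enumerates the swap neighborhood of a permutation. The tiny flow and distance matrices are made-up illustration data, and the function names are hypothetical; the sketch does not implement the elementary landscape decomposition itself.

```python
from itertools import combinations

def qap_cost(perm, flow, dist):
    """Standard QAP objective: facility i is placed at location perm[i]."""
    n = len(perm)
    return sum(
        flow[i][j] * dist[perm[i]][perm[j]]
        for i in range(n)
        for j in range(n)
    )

def swap_neighbors(perm):
    """Yield every permutation obtained by exchanging two positions of perm."""
    for i, j in combinations(range(len(perm)), 2):
        neighbor = list(perm)
        neighbor[i], neighbor[j] = neighbor[j], neighbor[i]
        yield neighbor

# Made-up symmetric instance with three facilities and three locations.
flow = [[0, 3, 1], [3, 0, 2], [1, 2, 0]]
dist = [[0, 5, 2], [5, 0, 4], [2, 4, 0]]
perm = [0, 1, 2]

cost = qap_cost(perm, flow, dist)
neighbor_costs = [qap_cost(nb, flow, dist) for nb in swap_neighbors(perm)]
```

A permutation of size n has n(n - 1)/2 swap neighbors, so `neighbor_costs` has three entries for this instance.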