About this book

This book constitutes the thoroughly refereed post-conference proceedings of the 9th International Conference on Learning and Intelligent Optimization, LION 9, which was held in Lille, France, in January 2015.

The 31 contributions presented were carefully reviewed and selected for inclusion in this book. The papers address all fields between machine learning, artificial intelligence, mathematical programming and algorithms for hard optimization problems. Special focus is given to algorithm selection and configuration, learning, fitness landscapes, applications, multi-objective optimization, max-clique problems, Bayesian optimization and global optimization, data mining and, in a special session, dynamic optimization.

Table of contents

Frontmatter

In view of the increasing importance of hardware parallelism, a natural extension of per-instance algorithm selection is to select a set of algorithms to be run in parallel on a given problem instance, based on features of that instance. Here, we explore how existing algorithm selection techniques can be effectively parallelized. To this end, we leverage the machine learning models used by existing sequential algorithm selectors, such as 3S, ISAC, SATzilla and ME-ASP, and modify their selection procedures to produce a ranking of the given candidate algorithms; we then select the top n algorithms under this ranking to be run in parallel on n processing units. Furthermore, we adapt the pre-solving schedules obtained by aspeed to be effective in a parallel setting with different time budgets for each processing unit. Our empirical results demonstrate that, using 4 processing units, the best of our methods achieves a 12-fold average speedup over the best single solver on a broad set of challenging scenarios from the algorithm selection library.
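The ranking idea described in this abstract can be illustrated with a minimal sketch. It assumes that some learned model already supplies a predicted runtime per candidate algorithm; the solver names, the numbers and both function names are hypothetical and are not taken from 3S, ISAC, SATzilla, ME-ASP or aspeed.

```python
from typing import Dict, List


def select_parallel_portfolio(predicted_runtime: Dict[str, float], n: int) -> List[str]:
    """Rank candidates by predicted runtime (ties broken by name) and keep the top n."""
    ranking = sorted(predicted_runtime, key=lambda name: (predicted_runtime[name], name))
    return ranking[:n]


def portfolio_runtime(selected: List[str], true_runtime: Dict[str, float]) -> float:
    """A parallel portfolio finishes as soon as its fastest member finishes."""
    return min(true_runtime[name] for name in selected)


# Hypothetical per-instance predictions (seconds) from some learned model.
predicted = {"solver_a": 120.0, "solver_b": 15.0, "solver_c": 40.0, "solver_d": 300.0}
# Hypothetical true runtimes: the model mis-ranks solver_b and solver_c.
actual = {"solver_a": 90.0, "solver_b": 60.0, "solver_c": 10.0, "solver_d": 250.0}

chosen_one = select_parallel_portfolio(predicted, n=1)
chosen_two = select_parallel_portfolio(predicted, n=2)
print(chosen_one, portfolio_runtime(chosen_one, actual))
print(chosen_two, portfolio_runtime(chosen_two, actual))
```

In this toy data, running the two top-ranked solvers in parallel hedges against the model's ranking error: the second-ranked solver turns out to be the fastest one.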

We present an algorithm selection benchmark based on optimal search algorithms for solving the container pre-marshalling problem (CPMP), an NP-hard problem from the field of container terminal optimization. Novel features are introduced and then systematically expanded through the recently proposed approach of latent feature analysis. The CPMP benchmark is interesting, as it involves a homogeneous set of parameterized algorithms that nonetheless result in a diverse range of performances. We present computational results using a state-of-the-art portfolio technique, thus providing a baseline for the benchmark.

] is to create a mixture of diverse algorithms that complement each other’s strengths so as to solve a diverse set of problem instances. Algorithm portfolios have taken on a new and practical meaning today with the wide availability of multi-core processors: from an enterprise perspective, the interest is to make best use of parallel machines within the organization by running different algorithms simultaneously on different cores to solve a given problem instance. Parallel execution of a portfolio of algorithms as suggested by [2, 3] a number of years ago has thus become a practical computing paradigm.

A major problem in deep learning is identifying appropriate hyperparameter configurations for deep architectures. This issue is important because: (1) inappropriate hyperparameter configurations will lead to mediocre performance; (2) little expert experience is available to make an informed decision. Random search is a straightforward choice for this problem; however, the high time cost of each test makes numerous trials impractical. The main strategy of our solution is based on data modeling via random forest, which is used as a tool to analyze how the performance of deep architectures varies with the hyperparameters and to explore underlying interactions between hyperparameters. This is a general method suitable for all types of deep architecture. Our approach is tested using a deep belief network: the error rate reduced from

Inspired by methods and theoretical results from parameterised algorithmics, we improve the state of the art in solving Cluster Editing, a prominent NP-hard clustering problem with applications in computational biology and beyond. In particular, we demonstrate that an extension of a certain preprocessing algorithm, called the $$(k+1)$$-data reduction rule in parameterised algorithmics, embedded in a sophisticated branch-&-bound algorithm, improves over the performance of existing algorithms based on Integer Linear Programming (ILP) and branch-&-bound. Furthermore, our version of the $$(k+1)$$-rule outperforms the theoretically most effective preprocessing algorithm, which yields a 2k-vertex kernel. Notably, this 2k-vertex kernel is analysed empirically for the first time here. Our new algorithm was developed by integrating Programming by Optimisation into the classical algorithm engineering cycle – an approach which we expect to be successful in many other contexts.

This paper introduces an automated approach called OSCAR that combines algorithm portfolios and online algorithm selection. The goal of algorithm portfolios is to construct a subset of algorithms with diverse problem solving capabilities. Algorithms are then selected from the portfolio to solve a particular instance or set of instances. Traditionally, algorithm selection is performed offline and requires domain knowledge about the target problem, while online algorithm selection techniques tend not to pay much attention to the careful construction of algorithm portfolios. By combining algorithm portfolios and online selection, we hope to design a problem-independent hybrid strategy with diverse problem solving capability. We apply OSCAR to design a portfolio of memetic operator combinations, each including one crossover, one mutation and one local search, rather than selecting single operators. An empirical analysis is performed on the Quadratic Assignment and Flowshop Scheduling problems to verify the feasibility, efficacy, and robustness of our proposed approach.

A simple model shows how a reasonable update scheme for the probability vector by which a hyper-heuristic chooses the next heuristic leads to neglecting useful mutation heuristics. Empirical evidence supports this on the MaxSat, TravelingSalesman, PermutationFlowshop and VehicleRoutingProblem problems. A new approach to hyper-heuristics is proposed that addresses this problem by modeling and learning hyper-heuristics by means of a hidden Markov model. Experiments show that this is a feasible and promising approach.

Differential evolution (DE) is a powerful and simple algorithm for single- and multi-objective optimization. However, its performance is highly dependent on the right choice of parameters. To mitigate this problem, mechanisms have been developed to automatically control the parameters during the algorithm run. These mechanisms are usually a part of a unified DE algorithm, which makes it difficult to compare them in isolation. In this paper, we go through various deterministic, adaptive, and self-adaptive approaches to parameter control, isolate the underlying mechanisms, and apply them to a single, simple differential evolution algorithm. We observe its performance and behavior on a set of benchmark problems. We find that even the simplest mechanisms can compete with parameter values found by exhaustive grid search. We also notice that self-adaptive mechanisms seem to perform better on problems which can be optimized with a very limited set of parameters. Yet, adaptive mechanisms seem to behave in a problem-independent way, detrimental to their performance.
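As an illustration of what a self-adaptive parameter control mechanism plugged into a simple DE can look like, the following sketch attaches a scale factor F and a crossover rate CR to every individual and occasionally resamples them, loosely in the style of the well-known jDE scheme. It is an assumption-laden toy and not one of the mechanisms evaluated in the paper: the function name, the sphere benchmark, the bounds, the resampling probability of 0.1 and the in-place population update are all illustrative choices.

```python
import random
from typing import Callable, List, Tuple


def sphere(x: List[float]) -> float:
    """Toy benchmark: sum of squares, minimum 0 at the origin."""
    return sum(v * v for v in x)


def self_adaptive_de(objective: Callable[[List[float]], float], dim: int = 5,
                     pop_size: int = 20, generations: int = 100,
                     seed: int = 0) -> Tuple[float, float]:
    """DE/rand/1/bin where each individual carries its own F and CR."""
    rng = random.Random(seed)
    pop = [[rng.uniform(-5.0, 5.0) for _ in range(dim)] for _ in range(pop_size)]
    fitness = [objective(ind) for ind in pop]
    f_vals = [0.5] * pop_size   # per-individual scale factors
    cr_vals = [0.9] * pop_size  # per-individual crossover rates
    initial_best = min(fitness)
    for _ in range(generations):
        for i in range(pop_size):
            # With probability 0.1, resample each control parameter before use.
            f_i = rng.uniform(0.1, 1.0) if rng.random() < 0.1 else f_vals[i]
            cr_i = rng.random() if rng.random() < 0.1 else cr_vals[i]
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            j_rand = rng.randrange(dim)  # guarantees one mutated coordinate
            trial = [
                pop[a][d] + f_i * (pop[b][d] - pop[c][d])
                if (rng.random() < cr_i or d == j_rand) else pop[i][d]
                for d in range(dim)
            ]
            trial_fit = objective(trial)
            if trial_fit <= fitness[i]:
                # Parameters survive only together with a successful trial.
                pop[i], fitness[i] = trial, trial_fit
                f_vals[i], cr_vals[i] = f_i, cr_i
    return initial_best, min(fitness)


initial_best, final_best = self_adaptive_de(sphere)
print(initial_best, final_best)
```

Because replacement is greedy, the best fitness in the population can never get worse, which makes such a loop a convenient harness for swapping in other control mechanisms and comparing them in isolation.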

This paper draws on three different sets of ideas from computer science to develop a self-learning system capable of delivering an obstacle avoidance decision tree for simple mobile robots. All three topic areas have received considerable attention in the literature but their combination in the fashion reported here is new. This work is part of a wider initiative on problems where human reasoning is currently the most commonly used form of control. Typical examples are in sense and avoid studies for vehicles – for example the current lack of regulator approved sense and avoid systems is a key road-block to the wider deployment of uninhabited aerial vehicles (UAVs) in civil airspaces.

The paper shows that by using well established ideas from logic circuit design (the “espresso” algorithm) to influence genetic programming (GP), it is possible to evolve well-structured case-based reasoning (CBR) decision trees that can be used to control a mobile robot. The enhanced search works faster than a standard GP search while also providing improvements in best and average results. The resulting programs are non-intuitive yet solve difficult obstacle avoidance and exploration tasks using a parsimonious and unambiguous set of rules. They are based on studying sensor inputs to decide on simple robot movement control over a set of random maze navigation problems.

-hard parallel machine scheduling problem that has received much attention in the literature due to its direct relevance to real-world applications. For solving this problem, we present a variable neighbourhood search that incorporates a learning mechanism for guiding the search. Computational results, compared with the best approaches for this problem, reveal that our algorithm is a suitable alternative for solving this problem efficiently.

Two different strategies for searching for a best-available service in adaptive, open software systems are simulated. The practical advantage of the theoretically optimal strategy over a ‘trivial choice’ approach is confirmed; however, the advantage was small in the simulation.

This paper compares novel self-adaptive Differential Evolution algorithms (SADEs) on noisy test functions to see how robust the algorithms are against noise in the fitness function. This paper also compares the performance of SADEs on real-world problems that estimate Bidirectional Reflectance Distribution Function properties of 3D objects.

This paper presents an analysis of different possible operators for local search algorithms in order to solve permutation-based problems. These operators can be defined by a distance metric that defines the neighborhood of the current configuration, and a selector that chooses the next configuration to be explored within this neighborhood. The performance of local search algorithms strongly depends on their ability to efficiently explore and exploit the search space. We propose here a methodological approach to study the properties and performance of distances and selectors in order to build operators that can be used either for intensification of the search or for diversification stages. Based on different observations, this approach allows us to define a simple generic hyperheuristic that adapts the choice of its operators to the problem at hand and that manages their use in order to ensure a good trade-off between intensification and diversification. Moreover, this hyperheuristic can be used on different permutation-based problems.

Because permutation problems are particularly challenging to model and optimise, the possibility of representing solutions by means of factoradics has recently been investigated, allowing algorithms from other domains to be used. Initial results have shown that methods using factoradics can efficiently explore the search space, but also have difficulty exploiting the best areas. In the present paper, the fitness landscape of the factoradic representation and one of its simplest operators is studied on the Permutation Flowshop Scheduling Problem (PFSP). The analysis highlights the presence of many local optima and a high ruggedness, which confirms that the factoradic representation is not suited for local search. In addition, comparison with the classic permutation representation establishes that local moves on the factoradic representation are less able to lead to the global optima on the PFSP. The study ends by presenting directions for using and improving the factoradic representation.
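For readers unfamiliar with factoradics, the sketch below shows one standard way to map a permutation to a vector of factorial-base digits (the Lehmer code) and back. It is offered as background only, on the assumption that the representation studied in the paper is of this kind; the function names are illustrative.

```python
from itertools import permutations
from typing import List


def permutation_to_factoradic(perm: List[int]) -> List[int]:
    """Lehmer code: digit i counts the later elements smaller than perm[i]."""
    return [sum(1 for later in perm[i + 1:] if later < perm[i])
            for i in range(len(perm))]


def factoradic_to_permutation(digits: List[int]) -> List[int]:
    """Inverse mapping: digit i picks the digits[i]-th smallest unused element."""
    remaining = list(range(len(digits)))
    perm = []
    for d in digits:
        perm.append(remaining.pop(d))
    return perm


code = permutation_to_factoradic([2, 0, 1])
print(code, factoradic_to_permutation(code))

# Every permutation of 4 elements survives an encode/decode round trip.
round_trip_ok = all(
    factoradic_to_permutation(permutation_to_factoradic(list(p))) == list(p)
    for p in permutations(range(4))
)
```

Digit i may take any value from 0 to n-1-i independently of the other digits, so every digit vector within those ranges decodes to a valid permutation. This is what lets operators designed for integer vectors be applied without a repair step.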

In this paper, we present a generic local search algorithm which artificially adds neutrality in search landscapes by discretizing the evaluation function. Some experiments on NK landscapes show that an adaptive discretization is useful to reach high local optima and to launch diversifications automatically. We believe that a hill-climbing algorithm using such an adaptive evaluation function could be more appropriate than a classical iterated local search mechanism.
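The effect of discretizing an evaluation function can be sketched as follows. This is a toy with a fixed discretization step on bit strings; the adaptive scheme from the paper is not reproduced, and the function names, the one-bit-flip neighborhood and the fraction-of-ones fitness are assumptions made for illustration.

```python
import math
import random
from typing import Callable, List


def discretize(value: float, step: float) -> float:
    """Round a fitness value down to a multiple of step."""
    return math.floor(value / step) * step


def neutral_hill_climb(fitness: Callable[[List[int]], float], start: List[int],
                       step: float, iterations: int, seed: int = 0) -> List[int]:
    """Maximizing hill climber that accepts moves which are improving or
    neutral under the discretized evaluation."""
    rng = random.Random(seed)
    current = list(start)
    for _ in range(iterations):
        neighbor = list(current)
        pos = rng.randrange(len(neighbor))
        neighbor[pos] = 1 - neighbor[pos]  # flip one bit
        if discretize(fitness(neighbor), step) >= discretize(fitness(current), step):
            current = neighbor
    return current


def fraction_of_ones(bits: List[int]) -> float:
    """Toy fitness standing in for an NK landscape."""
    return sum(bits) / len(bits)


start = [0] * 8
result = neutral_hill_climb(fraction_of_ones, start, step=0.25, iterations=200)
print(result, fraction_of_ones(result))
```

Two solutions with raw fitness 0.26 and 0.30 both evaluate to 0.25 under a step of 0.25, so the climber may drift between them. A larger step creates wider plateaus and therefore more diversification.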

The Covering Tour Problem (CTP) finds application in distribution network design. It includes two types of vertices: the covering ones and the ones to be covered. The problem is to identify a lowest-cost Hamiltonian cycle over a subset of the covering vertices such that every vertex to be covered is covered. Here, a vertex is considered covered when it is located within a given distance from a vertex in the tour. This paper presents a solution procedure based on a Selector operator that converts a giant tour into an optimal CTP solution. This operator is embedded in an adaptive large neighborhood search. The method is competitive, as shown by comparing the quality of its results with the output of a state-of-the-art exact algorithm.

We introduce a new problem arising in small and medium-sized container terminals: the Two-Dimensional Pre-Marshalling Problem (2D-PMP). It is an extension of the well-studied Pre-Marshalling Problem (PMP) that is crucial in container storage. The 2D-PMP is particularly challenging due to its complex side constraints, which are hard to express and difficult to handle with standard techniques for the PMP. We present three different heuristic approaches for the 2D-PMP. First, we adapt an existing construction heuristic that was designed for the classical PMP. We then apply this heuristic within two metaheuristics: a Pilot method and a Max-Min Ant System that incorporates a special pheromone model. In our empirical evaluation we observe that the Max-Min Ant System outperforms the other approaches by yielding better solutions in almost all cases.

We investigate per-instance algorithm selection techniques for solving the Travelling Salesman Problem (TSP), based on the two state-of-the-art inexact TSP solvers, LKH and EAX. Our comprehensive experiments demonstrate that the solvers exhibit complementary performance across a diverse set of instances, and the potential for improving the state of the art by selecting between them is significant. Using TSP features from the literature as well as a set of novel features, we show that we can capitalise on this potential by building an efficient selector that achieves significant performance improvements in practice. Our selectors represent a significant improvement in the state-of-the-art in inexact TSP solving, and hence in the ability to find optimal solutions (without proof of optimality) for challenging TSP instances in practice.

The Multiple Knapsack Assignment Problem (MKAP) is an extension of the Multiple Knapsack Problem, a well-known $$\mathcal {NP}$$-hard combinatorial optimization problem. The MKAP is a hard problem even for small-sized instances. In this paper, we propose an approximate approach for the MKAP based on a biased random key genetic algorithm. Our solution approach exhibits competitive performance when compared to the best approximate approach reported in the literature.

In this article, we propose to apply a hybrid method called DYNAMOP (DYNAmic programming using Metaheuristic for Optimization Problems) to solve the Unit Commitment Problem (UCP). DYNAMOP uses a representation based on a path in the graph of states of dynamic programming, which is adapted to the dynamic structure of the problem and facilitates the hybridization between evolutionary algorithms and dynamic programming. Experiments indicate that the proposed approach outperforms the best known approach in the literature.

) focuses on sampling the arms that are most likely to be misclassified (i.e., optimal or suboptimal arms) in order to identify the set of best arms, also known as the Pareto front. Our scalarized multi-objective LUCB (sMO-LUCB) is an adaptation of LUCB to reward vectors. Preliminary empirical results show good performance of the proposed algorithm on a bi-objective environment.
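The notions of a Pareto front over arms and of scalarizing reward vectors can be made concrete with a short sketch. It assumes known mean reward vectors and a linear scalarization; it does not implement LUCB or sMO-LUCB, and the arm names, values and weights are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """a dominates b (maximization): no worse everywhere, strictly better somewhere."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(mean_rewards: Dict[str, Tuple[float, ...]]) -> List[str]:
    """Arms whose mean reward vector is not dominated by any other arm."""
    return sorted(
        name for name, reward in mean_rewards.items()
        if not any(dominates(other, reward) for other in mean_rewards.values())
    )


def linear_scalarization(reward: Sequence[float], weights: Sequence[float]) -> float:
    """Collapse a reward vector into one number with a weighted sum."""
    return sum(w * r for w, r in zip(weights, reward))


# Hypothetical bi-objective mean rewards.
arms = {
    "arm_1": (0.9, 0.1),
    "arm_2": (0.5, 0.5),
    "arm_3": (0.1, 0.9),
    "arm_4": (0.4, 0.4),  # dominated by arm_2
}
front = pareto_front(arms)
best_for_weights = max(arms, key=lambda name: linear_scalarization(arms[name], (0.8, 0.2)))
print(front, best_for_weights)
```

Each weight vector singles out one arm, so covering the front requires several weight vectors. In a bandit setting the mean vectors are unknown and have to be estimated from samples.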

A supervised learning approach to generating composite linear priority dispatching rules for scheduling is studied. In particular, we investigate a number of strategies for generating training data for learning a linear dispatching rule using preference learning. The results show that generating a training data set from only optimal solutions is not as effective as also adding suboptimal solutions to the set. Furthermore, different strategies for creating preference pairs are investigated, as well as suboptimal solution trajectories. The different strategies are investigated on 2000 randomly generated problem instances using two different problem generator settings.

In this paper, we present a practical case of the multiobjective knapsack problem which concerns the elaboration of the optimal action plan in the social and medico-social sector. We provide a description and a formal model of the problem as well as some preliminary computational results. We perform an empirical analysis of the behavior of three metaheuristic approaches: a fast and elitist multiobjective genetic algorithm (NSGA-II), a Pareto Local Search (PLS) algorithm and an Indicator-Based Multi-Objective Local Search (IBMOLS).

This paper addresses the problem of derivative-free multi-objective optimization of real-valued functions under multiple inequality constraints. Both the objective and constraint functions are assumed to be smooth, nonlinear, expensive-to-evaluate functions. As a consequence, the number of evaluations that can be used to carry out the optimization is very limited. The method we propose to overcome this difficulty has its roots in the Bayesian and multi-objective optimization literatures. More specifically, we make use of an extended domination rule taking both constraints and objectives into account under a unified multi-objective framework and propose a generalization of the expected improvement sampling criterion adapted to the problem. A proof of concept on a constrained multi-objective optimization test problem is given as an illustration of the effectiveness of the method.
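As background for the sampling criterion mentioned in this abstract, the sketch below computes the classical expected improvement for single-objective, unconstrained minimization under a Gaussian prediction. The generalization to constraints and multiple objectives proposed in the paper is not reproduced; the function name and arguments are illustrative.

```python
import math


def expected_improvement(mean: float, std: float, best: float) -> float:
    """Closed-form E[max(best - Y, 0)] for Y ~ N(mean, std^2), minimization."""
    if std <= 0.0:
        # No predictive uncertainty: improvement is deterministic.
        return max(best - mean, 0.0)
    z = (best - mean) / std
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return (best - mean) * cdf + std * pdf


# A point predicted at the current best value still has positive EI
# whenever the model is uncertain about it.
ei_at_best = expected_improvement(mean=0.0, std=1.0, best=0.0)
print(ei_at_best)
```

The criterion rewards both a low predicted mean and a high predictive uncertainty, which is what makes it attractive when only a few evaluations are affordable.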

A method to generate various size-tunable benchmarks for multi-objective AI planning with a known Pareto Front has recently been proposed in order to provide a wide range of Pareto Front shapes and different magnitudes of difficulty. The performance of the Pareto-based multi-objective evolutionary planner DaE$$_{\text {YAHSP}}$$ is evaluated on some large instances with singular Pareto Front shapes, and compared to that of the single-objective aggregation-based approach.

When searching for a maximum clique of a graph using a branch-and-bound algorithm, it is usually believed that one should minimize the set of branching vertices from which search is necessary. In this paper, we propose an approach called incremental MaxSAT reasoning to reduce the set of branching vertices in three ways, developing three algorithms called DoMC (short for Dynamic ordering MaxClique solver), SoMC and SoMC- (short for Static ordering MaxClique solver), respectively. The three algorithms differ only in how they reduce the set of branching vertices. To our surprise, although DoMC achieves the smallest set of branching vertices, it is significantly worse than SoMC and SoMC-, because it has to change the vertex ordering for branching when reducing the set of branching vertices. SoMC is the best, because it preserves the static vertex ordering for branching and reduces the set of branching vertices more than SoMC-.

In this paper we present a new approach to reduce the computational time spent on coloring in one of the recent branch-and-bound algorithms for the maximum clique problem. In this algorithm, candidates for the maximum clique are colored in every search tree node. We suggest reusing the coloring computed in the parent node for the child nodes when this does not lead to many new branches. We therefore reuse the same coloring only in the nodes for which the upper bound exceeds the current best solution by only a small value

This work extends the Random Embedding Bayesian Optimization approach by integrating a warping of the high dimensional subspace within the covariance kernel. The proposed warping, which relies on elementary geometric considerations, mitigates the drawbacks of the high extrinsic dimensionality while preventing the algorithm from evaluating points that give redundant information. It also alleviates constraints on bound selection for the embedded domain, thus improving the robustness, as illustrated with a test case with 25 variables and intrinsic dimension 6.

The global optimization of expensive-to-calculate continuous functions is of great practical importance in engineering. Among the proposed algorithms for solving such problems, Efficient Global Optimization (EGO) and Covariance Matrix Adaptation Evolution Strategy (CMA-ES) are regarded as two state-of-the-art unconstrained continuous optimization algorithms. Their underlying principles and performances are different, yet complementary: EGO fills the design space in an order controlled by a Gaussian process (GP) conditioned by the objective function while CMA-ES learns and samples multi-normal laws in the space of design variables. This paper proposes a new algorithm, called EGO-CMA, which combines EGO and CMA-ES. In EGO-CMA, the EGO search is interrupted early and followed by a CMA-ES search whose starting point, initial step size and covariance matrix are calculated from the already sampled points and the associated conditional GP. EGO-CMA improves the performance of both EGO and CMA-ES in our 2 to 10 dimensional experiments.

is dedicated to clustering. Indeed, it is well known that clustering may be seen as a multi-objective optimization problem, as the goal is both to minimize distances between data belonging to the same cluster and to maximize distances between data belonging to different clusters. In this paper we present the framework as well as experimental results that attest to the benefit of using multi-objective approaches for clustering.
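The two competing clustering goals named above can be written down as a pair of objective values. The sketch assumes Euclidean distance and uses mean pairwise distances as the two objectives; the framework's actual objective functions are not given in the abstract, and the function name and data are illustrative.

```python
import math
from itertools import combinations
from typing import List, Sequence, Tuple


def clustering_objectives(points: Sequence[Tuple[float, ...]],
                          labels: Sequence[int]) -> Tuple[float, float]:
    """Return (mean intra-cluster distance, mean inter-cluster distance).
    The first is to be minimized, the second maximized."""
    intra: List[float] = []
    inter: List[float] = []
    for (p, label_p), (q, label_q) in combinations(zip(points, labels), 2):
        (intra if label_p == label_q else inter).append(math.dist(p, q))

    def mean(values: List[float]) -> float:
        return sum(values) / len(values) if values else 0.0

    return mean(intra), mean(inter)


# Two well-separated pairs of points.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
good = clustering_objectives(points, [0, 0, 1, 1])  # groups nearby points
bad = clustering_objectives(points, [0, 1, 0, 1])   # groups distant points
print(good, bad)
```

On this toy data the sensible labelling is better on both objectives, but in general the two goals conflict, which is why a set of trade-off solutions rather than a single clustering is of interest.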

Many real world problems can be solved effectively by metaheuristics in combination with neighbourhood search. However, implementing neighbourhood search for a particular problem domain can be time consuming and so it is important to get the most value from it. Hyper-heuristics aim to get such value by using a specific API such as ‘HyFlex’ to cleanly separate the search control structure from the details of the domain. Here, we discuss various longer-term additions to the HyFlex interface that will allow much richer information exchange, and so enhance learning via data science techniques, but without losing domain independence of the search control.