Publications

You can find an updated list of my articles on my Google Scholar profile.

Quality with Just Enough Diversity in Evolutionary Policy Search

Authors: Paul Templier, Luca Grillotti, Emmanuel Rachelson, Dennis G. Wilson, Antoine Cully

Published in GECCO 2024 - Best Paper Award, 2024

Evolution Strategies (ES) are effective gradient-free optimization methods that can be competitive with gradient-based approaches for policy search. ES rely only on the total episodic scores of solutions in their population, from which they estimate fitness gradients for their update with no access to true gradient information. However, this makes them sensitive to deceptive fitness landscapes, and they tend to only explore one way to solve a problem. Quality-Diversity (QD) methods such as MAP-Elites introduced additional information with behavior descriptors (BD) to return a population of diverse solutions, which helps exploration but leads to a large part of the evaluation budget not being focused on finding the best performing solution. Here we show that behavior information can also be leveraged to find the best policy by identifying promising search areas which can then be efficiently explored with ES. We introduce the framework of Quality with Just Enough Diversity (JEDi) which learns the relationship between behavior and fitness to focus evaluations on solutions that matter. When trying to reach higher fitness values, JEDi outperforms both QD and ES methods on hard exploration tasks like mazes and on complex control problems with large policies.
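
As a rough illustration of how an ES can estimate a fitness gradient from episodic scores alone, here is a minimal sketch of a generic Gaussian-perturbation ES step. It is not JEDi and not the implementation from the paper; the function names, the score normalization, and the hyperparameter values are all assumptions made for this example.

```python
import random


def es_step(mean, fitness_fn, rng, pop_size=20, sigma=0.1, lr=0.02):
    """One update of a basic ES: perturb, score, move along the estimated gradient.

    Only the scalar scores returned by fitness_fn are used; no true gradient
    of the fitness is ever computed. All defaults are illustrative assumptions.
    """
    dim = len(mean)
    # Sample Gaussian perturbations around the current mean.
    noises = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(pop_size)]
    scores = [
        fitness_fn([m + sigma * e for m, e in zip(mean, eps)]) for eps in noises
    ]
    # Normalize scores so the step size does not depend on the fitness scale.
    avg = sum(scores) / pop_size
    std = (sum((s - avg) ** 2 for s in scores) / pop_size) ** 0.5 or 1.0
    # Score-weighted average of the perturbations approximates the gradient.
    grad = [
        sum((s - avg) / std * eps[d] for s, eps in zip(scores, noises))
        / (pop_size * sigma)
        for d in range(dim)
    ]
    return [m + lr * g for m, g in zip(mean, grad)]


def sphere(x):
    """Toy fitness to maximize: highest at the origin."""
    return -sum(v * v for v in x)


rng = random.Random(0)
start = [2.0, -1.5]
mean = list(start)
for _ in range(100):
    mean = es_step(mean, sphere, rng)
```

On this toy problem there is a single optimum, so following the estimated gradient is enough; a deceptive landscape is exactly the case where such an update can get stuck.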

Genetic Drift Regularization: on preventing Actor Injection from breaking Evolution Strategies

Authors: Paul Templier, Emmanuel Rachelson, Antoine Cully, Dennis G. Wilson

Published in IEEE CEC 2024, 2024

Evolutionary Algorithms (EA) have been successfully used for the optimization of neural networks for policy search, but they still remain sample inefficient and underperforming in some cases compared to gradient-based reinforcement learning (RL). Various methods combine the two approaches, many of them training an RL algorithm on data from EA evaluations and injecting the RL actor into the EA population. However, when using Evolution Strategies (ES) as the EA, the RL actor can drift genetically far from the ES distribution and injection can cause a collapse of the ES performance. Here, we highlight the phenomenon of genetic drift where the actor genome and the ES population distribution progressively drift apart, leading to injection having a negative impact on the ES. We introduce Genetic Drift Regularization (GDR), a simple regularization method in the actor training loss that prevents the actor genome from drifting away from the ES. We show that GDR can improve ES convergence on problems where RL learns well, but also helps RL training on other tasks, and fixes the injection issues better than previous controlled injection methods.

Searching Search Spaces: Meta-evolving a Geometric Encoding for Neural Networks

Authors: Tarek Kunze, Paul Templier, Dennis G. Wilson

Published in IEEE CEC 2024, 2024

In evolutionary policy search, neural networks are usually represented using a direct mapping: each gene encodes one network weight. Indirect encoding methods, where each gene can encode multiple weights, shorten the genome to reduce the dimensions of the search space and better exploit permutations and symmetries. The Geometric Encoding for Neural network Evolution (GENE) introduced an indirect encoding where the weight of a connection is computed as the (pseudo-)distance between the two linked neurons, leading to a genome size growing linearly with the number of neurons instead of quadratically in direct encoding. However, GENE still relies on hand-crafted distance functions with no prior optimization. Here we show that better performing distance functions can be found for GENE using Cartesian Genetic Programming (CGP) in a meta-evolution approach, hence optimizing the encoding to create a search space that is easier to exploit. We show that GENE with a learned function can outperform both direct encoding and the hand-crafted distances, generalizing on unseen problems, and we study how the encoding impacts neural network properties.

LUCIE: An Evaluation and Selection Method for Stochastic Problems

Authors: Erwan Lecarpentier, Paul Templier, Emmanuel Rachelson, Dennis G. Wilson

Published in GECCO 2022, 2022

Selection in genetic algorithms is difficult for stochastic problems due to noise in the fitness space. Common methods to deal with this fitness noise include sampling multiple fitness values, which can be expensive. We propose LUCIE, the Lower Upper Confidence Intervals Elitism method, which selects individuals based on confidence. By focusing evaluation on separating promising individuals from others, we demonstrate that LUCIE can be effectively used as an elitism mechanism in genetic algorithms. We provide a theoretical analysis on the convergence of LUCIE and demonstrate its ability to select fit individuals across multiple types of noise on the OneMax and LeadingOnes problems. We also evaluate LUCIE as a selection method for neuroevolution on control policies with stochastic fitness values.
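
To give a feel for confidence-based elitism, the sketch below keeps re-evaluating only the two individuals whose confidence intervals still blur the boundary between elites and non-elites. This is not the LUCIE algorithm from the paper: the Hoeffding bound, the assumption that scores lie in [0, 1], the stopping rule, and every name here are choices made for this example.

```python
import math


def hoeffding_radius(n, delta=0.05):
    """Half-width of a Hoeffding confidence interval for n samples in [0, 1]."""
    return math.sqrt(math.log(2.0 / delta) / (2.0 * n))


def select_elites(evaluate, population, n_elites, max_evals, delta=0.05):
    """Return (elites, evaluations used).

    Extra evaluations go only to the elite with the lowest lower bound and
    the non-elite with the highest upper bound, until the two groups are
    separated or the evaluation budget is spent.
    """
    sums = [evaluate(ind) for ind in population]  # one initial sample each
    counts = [1] * len(population)
    evals = len(population)
    while True:
        means = [s / c for s, c in zip(sums, counts)]
        radii = [hoeffding_radius(c, delta) for c in counts]
        ranked = sorted(range(len(population)), key=lambda i: means[i], reverse=True)
        elites, others = ranked[:n_elites], ranked[n_elites:]
        if not others:
            break  # everyone is an elite, nothing to separate
        weakest = min(elites, key=lambda i: means[i] - radii[i])
        strongest = max(others, key=lambda i: means[i] + radii[i])
        separated = (
            means[weakest] - radii[weakest] >= means[strongest] + radii[strongest]
        )
        if separated or evals + 2 > max_evals:
            break
        # Spend the budget where the ranking is still uncertain.
        for i in (weakest, strongest):
            sums[i] += evaluate(population[i])
            counts[i] += 1
        evals += 2
    return [population[i] for i in elites], evals


# Hypothetical usage with a noise-free evaluation, for illustration only.
elites, used = select_elites(lambda x: x, [0.1, 0.9, 0.2, 0.8], 2, 1000)
```

Compared with sampling every individual a fixed number of times, this kind of rule lets clearly good or clearly bad individuals keep a small sample count.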

A Geometric Encoding for Neural Network Evolution

Authors: Paul Templier, Emmanuel Rachelson, Dennis G. Wilson

Published in GECCO 2021, 2021

A major limitation to the optimization of artificial neural networks (ANN) with evolutionary methods lies in the high dimensionality of the search space, the number of weights growing quadratically with the size of the network. This leads to expensive training costs, especially in evolution strategies which rely on matrices whose sizes grow with the number of genes. We introduce a geometric encoding for neural network evolution (GENE) as a representation of ANN parameters in a smaller space that scales linearly with the number of neurons, allowing for efficient parameter search. Each neuron of the network is encoded as a point in a latent space and the weight of a connection between two neurons is computed as the distance between them. The coordinates of all neurons are then optimized with evolution strategies in a reduced search space while not limiting network fitness and possibly improving search.
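
The decoding idea can be sketched in a few lines: each neuron is a point in a latent space, and a connection weight is the distance between its two endpoints. The sketch below assumes a plain Euclidean distance and a fully connected layer; both are assumptions for illustration, and a Euclidean distance can only produce non-negative weights, so it should not be read as the distance function used in the paper.

```python
import math


def decode_layer_weights(in_coords, out_coords):
    """Weight matrix where w[i][j] is the distance between neurons i and j."""
    return [[math.dist(p, q) for q in out_coords] for p in in_coords]


def geometric_genome_size(layer_sizes, latent_dim):
    """Genes needed when every neuron is a point with latent_dim coordinates."""
    return sum(layer_sizes) * latent_dim


def direct_genome_size(layer_sizes):
    """Genes needed with one gene per weight (biases ignored for simplicity)."""
    return sum(a * b for a, b in zip(layer_sizes, layer_sizes[1:]))


# Hypothetical 2-input, 1-output layer with neurons placed in a 2D latent space.
weights = decode_layer_weights([(0.0, 0.0), (3.0, 0.0)], [(3.0, 4.0)])
```

For three layers of 64 neurons and a 3-dimensional latent space, these helpers give 576 genes for the geometric genome against 8192 for the direct one, which is the linear versus quadratic growth described in the abstract.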

Evolving a Dota 2 bot: Illuminating search in CGP and NEAT

Authors: Paul Templier, Lucas Hervier, Dennis G. Wilson

Published in Competition at GECCO 2020, 2020

In this work we present an evolution-based approach applied to Dota 2 in the Project Breezy challenge. The goal of this project is to train an agent to play a 1v1 Midlane match against the game's bots of varying difficulties, with both sides playing Shadow Fiend. The approach we implemented relies on the MAP-Elites algorithm assisted with a neural-based simulator of the game to increase behavior diversity and reduce computation load, using CGP agents or NEAT networks as individuals.