Neural Graph Evolution
Tingwu Wang*, Yuhao Zhou*, Sanja Fidler, Jimmy Ba

Despite the recent successes in robotic locomotion control, the design of robots relies heavily on human engineering. Automatic robot design has been a long-studied subject, but recent progress has been slowed by the large combinatorial search space and the difficulty of evaluating candidate designs. To address these two challenges, we formulate automatic robot design as a graph search problem and perform evolutionary search in graph space. We propose Neural Graph Evolution (NGE), which iteratively selects among current candidates and evolves new ones. Unlike previous approaches, NGE uses graph neural networks to parameterize the control policies, which reduces the evaluation cost of new candidates through skill transfer from previously evaluated designs. In addition, NGE applies Graph Mutation with Uncertainty (GM-UC), which incorporates model uncertainty to reduce the search space by balancing exploration and exploitation. We show that NGE significantly outperforms previous methods, by an order of magnitude. As shown in our experiments, NGE is the first algorithm that can automatically discover kinematically preferred robotic graph structures, such as a fish with two symmetrical flat side-fins and a tail, or a cheetah with athletic front and back legs. Instead of using thousands of CPU cores for weeks, NGE solves the search problem within a day on a single 64-core Amazon EC2 machine.
Paper |
|
Citation: Paper PDF [9.5 MB] | OpenReview | Codes (* denotes equal contribution) |
Neural Graph Evolution
|
We highlight the novelty of the algorithm in the following points: (1) automatic robot design is formulated as a search over graphs, with selection and mutation performed directly in graph space; (2) graph neural networks parameterize the control policies, so skills transfer from previously evaluated designs and new candidates are cheaper to evaluate; (3) Graph Mutation with Uncertainty (GM-UC) incorporates model uncertainty to balance exploration and exploitation, reducing the effective search space.
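The selection-then-mutation outer loop described above can be sketched in a few lines. This is a minimal illustrative version: the function names, the `survival_ratio` hyper-parameter, and the plain top-k selection are our assumptions, not the authors' exact interface (in NGE, `fitness_fn` would involve training a graph-neural-network controller, and selection uses GM-UC rather than raw fitness).

```python
import random

def nge_search(init_population, fitness_fn, mutate_fn, n_generations,
               survival_ratio=0.5):
    """Sketch of an evolutionary outer loop: evaluate, select, mutate.

    Placeholder interface for illustration only; in NGE the candidates
    are robot graphs and fitness comes from training a GNN controller.
    """
    population = list(init_population)
    for generation in range(n_generations):
        # Evaluate every candidate and rank by fitness.
        scored = sorted(((fitness_fn(g), g) for g in population),
                        key=lambda pair: pair[0], reverse=True)
        # Selection: keep the fitter portion of the population.
        n_keep = max(1, int(len(scored) * survival_ratio))
        survivors = [g for _, g in scored[:n_keep]]
        # Mutation: refill the population by perturbing survivors.
        children = [mutate_fn(random.choice(survivors))
                    for _ in range(len(population) - n_keep)]
        population = survivors + children
    return max(population, key=fitness_fn)
```

As a toy usage example, searching over integers with fitness `-abs(x - 10)` and a random-walk mutation converges toward 10, since the best survivor is never discarded.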
|
|
Quantitative Results
|
We created two customized environments to benchmark our algorithm. The first has no gravity, simulating locomotion in water; the other has gravity, simulating locomotion on land. Both environments reward the agent for moving faster in a fixed direction. Structure design in reinforcement learning is an under-explored topic, so we provided a total of 5 baselines, ranging from naive evolutionary structure search to modern variants of related work. From the curves (left: environment without gravity; right: environment with gravity), we can see that our algorithm performs significantly better than the other methods.
|
|
|
Qualitative Results Demo
|
We here present qualitative results of our algorithm running in the first environment. The text at the bottom-left corner has the following meaning: "Generation" is the generation (outer loop) of the evolutionary algorithm; "Reward" is the amortized fitness, in this case the speed of the fish in a fixed direction. A side view of the agents' animation is presented.
|
We highlight some representative species that occurred in the genealogy tree.
In the early generations, the agent started with randomly generated body parts.
It made little progress because its body parts were unbalanced.
It quickly learnt to discard the side fins and paddle with the back fin in order to swim faster.
As training continued, it gradually discovered that symmetric side fins, which originally appeared as a small body part, performed better.
Eventually, the agent learnt a controller matched to the structure it had evolved, which agrees with kinematic phenomena found in nature.
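The body-part changes traced above (gaining a fin, losing a fin) are the kind of graph edits an evolutionary mutation step performs. A toy sketch of such add-node and remove-node mutations on a kinematic tree follows; the dict-of-children representation and the helper names are ours for illustration, not the paper's code.

```python
import random

def add_limb(tree, new_id):
    """Attach a new body part under a randomly chosen existing node.

    `tree` maps each node id to a list of child ids; node 0 is the root.
    Mirrors the spirit of an add-node mutation, not NGE's exact operator.
    """
    parent = random.choice(list(tree.keys()))
    tree[parent].append(new_id)
    tree[new_id] = []
    return tree

def remove_limb(tree, node_id):
    """Delete a leaf body part (never the root); no-op otherwise."""
    if node_id == 0 or tree.get(node_id):
        return tree
    del tree[node_id]
    for children in tree.values():
        if node_id in children:
            children.remove(node_id)
    return tree
```

For example, starting from a root body with two fins `{0: [1, 2], 1: [], 2: []}`, `add_limb` grows a third part somewhere on the tree and `remove_limb` can later prune it away if it is a leaf.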
|
We study how symmetric structures were generated by visualizing the genealogy tree of the species. For visualization purposes, we pruned the genealogy tree to highlight the significant developments of one species. Most notably, the entire species was eventually dominated by kinematically fit individuals. Those without a good structure (lopsided or minimal bodies) ceased to exist due to competition from fitter structures. The genealogy tree of the species trained in the second environment is attached below.
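One simple way to prune a genealogy tree for visualization, as done above, is to keep only the ancestral chain of a chosen individual. A minimal sketch, assuming each species records its parent (the `parents` mapping and function name are hypothetical, not the paper's tooling):

```python
def prune_to_lineage(parents, best):
    """Return the ancestral chain of `best`, ordered root-first.

    `parents` maps each species id to its parent id; the root maps
    to None. Everything off this chain is dropped from the plot.
    """
    lineage = []
    node = best
    while node is not None:
        lineage.append(node)
        node = parents[node]
    return lineage[::-1]
```

Plotting only this chain shows how the dominant design accumulated its changes generation by generation, without the clutter of extinct branches.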
|
|
|
Acknowledgement |
|
The research is supported by NSERC. We thank NVIDIA for GPU donation. We thank Relu Patrascu for infrastructure support. |
|
Credits: University of Toronto, Vector Institute: Yuhao Zhou, Tingwu Wang, Sanja Fidler, Jimmy Ba |