    AI that Grows

    Research and development of workflows for the co-design of reconfigurable AI software and hardware.

    Weight Agnostic Neural Networks (WANN)

    Introduction

    • Paper
      • "focus on finding minimal architectures".
      • "By deemphasizing learning of weight parameters, we encourage the agent instead to develop ever-growing networks that can encode acquired skills based on its interactions with the environment".

    Case Study: Cart-Pole Swing Up

    Cart-pole swing-up is one of the most famous benchmarks in non-linear control. There are many approaches, including standard Q-learning on a discretized state space, deep Q-learning, and linear Q-learning with a continuous state space.
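
    For reference, a minimal sketch of the first of those baselines, tabular Q-learning on a discretized state space. The table size, hyperparameters, and the assumption that the continuous observation has already been binned into an integer state index are illustrative, not part of the WANN code base.

    ```python
    import numpy as np

    # Hypothetical sketch of tabular Q-learning for cart-pole with a discretized
    # state: `s` and `s_next` are integer bin indices of the continuous observation.
    n_states, n_actions = 10_000, 2
    Q = np.zeros((n_states, n_actions))
    alpha, gamma, eps = 0.1, 0.99, 0.1

    def act(s):
        # epsilon-greedy action selection over the discretized state
        return np.random.randint(n_actions) if np.random.rand() < eps else int(Q[s].argmax())

    def update(s, a, r, s_next):
        # one-step TD update toward the target r + gamma * max_a' Q(s', a')
        Q[s, a] += alpha * (r + gamma * Q[s_next].max() - Q[s, a])
    ```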

    WANN is interesting because it tries to find the simplest network that maps the input sensors (position, rotation and their derivatives) to the output (force). It focuses on learning principles, not only on tuning weights.
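
    The core idea can be sketched as follows: a candidate topology is scored not with trained weights but by rolling it out with a single shared weight swept over a few fixed values and averaging the return. This is a minimal sketch, assuming a Gym-style environment and a `forward(obs, weight)` callable for the candidate graph; neither name comes from the actual code base.

    ```python
    import numpy as np

    SHARED_WEIGHTS = [-2.0, -1.0, -0.5, 0.5, 1.0, 2.0]  # fixed values, nothing is trained

    def evaluate_topology(forward, env, episodes_per_weight=1):
        """Score a topology by its average return over several shared-weight values.

        `forward(obs, weight)` evaluates the candidate graph with every connection
        set to the same shared weight; `env` is assumed to follow the Gym API.
        Both are placeholders for whatever the real code base provides.
        """
        returns = []
        for w in SHARED_WEIGHTS:
            for _ in range(episodes_per_weight):
                obs, total, done = env.reset(), 0.0, False
                while not done:
                    action = forward(obs, w)          # same weight on every connection
                    obs, reward, done, _ = env.step(action)
                    total += reward
                returns.append(total)
        return np.mean(returns)  # weight-agnostic fitness of the topology
    ```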

    This is one of the output networks, and because of its simplicity it is not a black box: one can deduce the principles it has learnt (see https://towardsdatascience.com/weight-agnostic-neural-networks-fce8120ee829):

    • The position input is almost directly linked to the force; there is only an inverter in between, which means that whenever the cart is to the right or left of the centre (±x), it always pushes in the opposite direction, back toward the centre.
    • Because all connections share the same weight, the network found that one inverter is not enough, so it doubled it.
    • It also discovered symmetry: most of the other inputs pass through a Gaussian activation, which gives the same result for -x and x, so the network is agnostic to the sign of the input (a rough sketch of this deduced policy follows the list).
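
    As a rough illustration only (this is not the exact evolved network from the paper or blog post), the principles above amount to a controller of roughly this shape, with a hypothetical shared weight `w`:

    ```python
    import math

    def gaussian(z):
        # symmetric activation: identical output for z and -z
        return math.exp(-z * z)

    def force(x, x_dot, theta, theta_dot, w=1.0):
        # Hypothetical reconstruction of the principles listed above: the cart
        # position is inverted (twice) so the cart is always pushed back toward
        # the centre, while the remaining sensors pass through sign-agnostic
        # Gaussian units.
        centering = 2 * (-w * x)                      # doubled inverter on position
        symmetric = gaussian(w * x_dot) + gaussian(w * theta) + gaussian(w * theta_dot)
        return centering + w * symmetric
    ```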

    Example Implementation: Frep Search

    As a first step toward understanding the code and WANN training, I implemented a toy problem: learning the functional representation (FRep) of a target image. Given the (x, y) position of every pixel in an image, the task is to find the distance function that represents how far that pixel is from the edge of the shape.

    In the following training run the target shape is a circle and the inputs are the x and y positions of the pixels. The search found a minimal neural network architecture (built from a given library of non-linear activation functions) that maps the input position onto the target shape.
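
    For concreteness, here is a minimal sketch of how the training data for that circle target can be built. The image size, coordinate range, and variable names are illustrative assumptions, not the actual code.

    ```python
    import numpy as np

    def circle_sdf(x, y, radius=0.5):
        # Signed distance from (x, y) to the edge of a circle centred at the origin:
        # negative inside the shape, zero on the boundary, positive outside.
        return np.sqrt(x ** 2 + y ** 2) - radius

    # One (x, y) input per pixel, with the distance to the edge as the target the
    # evolved network has to reproduce.
    size = 64
    xs, ys = np.meshgrid(np.linspace(-1, 1, size), np.linspace(-1, 1, size))
    inputs = np.stack([xs.ravel(), ys.ravel()], axis=1)   # shape (size*size, 2)
    targets = circle_sdf(xs, ys).ravel()                  # shape (size*size,)
    ```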

    Graph Evolution: