24 Mar 2016

Hello world: the genetic way

This isn't going to be a mind-blowing post, but it was a fun little experiment to code up. Essentially, I am going to present the 'hello world' of genetic algorithms. And I mean this literally: I am going to initialise chromosomes that hold random strings of characters, and use a genetic algorithm to converge on my desired message: 'Hello World'.

For those of you who are interested in seeing my implementation, it is here, but as this is such a fun, straightforward example I would suggest implementing your own version, should you be interested! Also, note that this is Python 3: I'm trying to get with the times, and avoid Python 2 where possible.

# Overview

Genetic algorithms have been around for a while, and have certainly been applied to problems significantly cooler than what I am about to do. They are a pretty nifty type of optimisation algorithm that models natural selection to find an optimal solution. While they may not be as efficient as standard methods (gradient descent or more funky variants such as BFGS) for certain classes of problems, they can be applied to functions that don't have continuous derivatives. In machine learning, they can be used to perform automatic feature selection; that is, you can use them to determine which set of inputs should be included to build the best regression or classification model.

So what's the basic idea? You firstly need to define a cost (or 'fitness') function. In my case, the target is an 11-character string. The desired result is the string 'Hello World', so I am going to define the cost function to be a measure of the deviation of a candidate string from this value. Python has a handy built-in function for this: ord() converts a single character to its numeric (code point) representation. Therefore, I can define the cost quite simply by comparing each character of a candidate against the corresponding character of the desired solution.
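As a quick standalone sketch of that idea (`cost` here is a free function just for illustration; the class-based version appears later in the post):

```
def cost(candidate, objective):
    """Sum of squared differences between character code points."""
    return sum((ord(a) - ord(b)) ** 2 for a, b in zip(candidate, objective))

print(cost("Hello World", "Hello World"))  # 0: a perfect match
print(cost("Hello Xorld", "Hello World"))  # 1: 'X' and 'W' differ by one
```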

Now, let's think about representations. The basic strategy with a genetic algorithm is to consider a population of chromosomes. Each of these chromosomes is initialised randomly, and can be assessed for fitness using the cost function. In my case, each object of class Chromosone has an attribute that is an 11-character string (initialised randomly), and therefore the fitness can be assessed as previously discussed. The entire population is assessed, and then a 'breeding' phase occurs. In my simple model, I choose the top 20 'fittest' chromosomes, and breed them by randomly selecting pairs and combining them together. I repeat this until I have 100 new chromosomes.

The final step is to allow a mutation (typically with a very small probability). This step is essential to ensure that during the optimisation process the solution doesn't get caught in a local (rather than the global) minimum. The mutation probability is a tuning parameter that can be tweaked, but we won't dwell on that for now. In practice, I will select chromosomes with a small probability, and those selected will have one of their characters changed by sampling (at random) a new character.
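A standalone sketch of this mutation step on a plain string (`maybe_mutate` and `CHARSET` are my own illustrative names, not part of the implementation):

```
import random
import string

CHARSET = string.ascii_letters + string.digits + " "

def maybe_mutate(value, prob=0.02):
    """With probability prob, replace one randomly chosen character."""
    if random.random() < prob:
        i = random.randrange(len(value))
        value = value[:i] + random.choice(CHARSET) + value[i + 1:]
    return value
```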

And then we repeat! Survival of the fittest means that the best solutions are combined to (hopefully) produce even better solutions, and this process is run over a number of generations. In this case I know that by design a perfect fitness is possible, so I will stop once I have a Chromosone object whose value is 'Hello World'.

Well, that's the basic idea: below is some discussion around the implementation.

# Chromosone Class and Methods

The Chromosone class is conceptually straightforward; it just needs to be a means to hold a string. The first thing I do is define __init__ and __str__ methods; this is good practice. The __init__ method randomly initialises a string of a defined length using the string module and a handy list comprehension. The value attribute of the Chromosone object is this string, and the string representation method simply returns this value.

Two other methods are needed. Firstly, a cost method, which calculates the fitness of the chromosome by comparison with the objective (that is, the desired string). Remember, when the cost is zero, we have arrived at the desired solution. Secondly, we need a combine method: this acts on self, and takes another Chromosone object and a position as arguments. Note that it is not pure: it modifies self in place, splicing self and other at the defined position.

The Chromosone class and associated methods are therefore defined with the following syntax:

```
import copy
import random
import string


class Chromosone:
    '''
    Represents a chromosome
    '''
    def __init__(self, N):
        '''
        initialisation method for Chromosone:
        value is a random string of length N
        '''
        tmp = ''.join(
            random.SystemRandom().choice(
                string.ascii_uppercase +
                string.ascii_lowercase +
                string.digits +
                " "
            ) for _ in range(N))
        self.value = tmp

    def __str__(self):
        '''
        representation of a Chromosone instance:
        just print the string
        '''
        return(self.value)

    def cost(self, objective):
        '''
        method to calculate cost.
        objective is the string that represents
        the desired solution
        '''
        cost = 0
        for i in range(len(objective)):
            cost += (ord(objective[i]) - ord(self.value[i])) ** 2
        return(cost)

    def combine(self, other, position):
        '''
        method to combine two chromosomes
        at specified position.
        note this is not pure: it modifies self in place
        '''
        self.value = self.value[:position] + other.value[position:]
```

The __init__ method builds a random string of letters, digits, and spaces of length N, which is stored in the value attribute of the object. The string and random modules make this process fairly straightforward. The remaining methods are fairly simple, and should be self explanatory.
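The splice performed by the combine method can be sanity-checked in isolation on plain strings (`crossover` is a hypothetical stand-in for the method, not part of the class):

```
def crossover(a, b, position):
    """Keep a's prefix up to position, then take b's suffix."""
    return a[:position] + b[position:]

print(crossover("Hello Qqqqq", "Xxxxx World", 6))  # 'Hello World'
```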

# Population Class and Methods

A Population object is essentially just a collection of Chromosone objects. The idea is to initialise the first generation of chromosomes, and then be able to select the fittest N, breed them, and then mutate some of the children with a small probability. This process is repeated, gradually improving the fitness of the chromosomes in the Population, until we hopefully converge on the desired solution.

The required class definition and syntax is:

```
class Population:
    '''
    represents a population of chromosomes
    '''
    def __init__(self, objective, pop_size = 100, mutate_prob = 0.02):
        self.objective = objective
        self.pop_size = pop_size
        self.mutate_prob = mutate_prob
        self.pool = []
        for i in range(self.pop_size):
            tmp = Chromosone(len(objective))
            self.pool.append(tmp)

    def __str__(self):
        res = []
        for chromosone in self.pool:
            res.append(
                str(chromosone) +
                " " +
                str(chromosone.cost(self.objective))
            )
        return('\n'.join(res) + '\n')

    def order(self):
        '''
        order method: arrange chromosomes in pool from
        best (lowest cost) to worst
        '''
        fitness = dict()
        for chromosone in self.pool:
            fitness[chromosone] = chromosone.cost(self.objective)

        tmp = list()
        for k, v in fitness.items():
            tmp.append((v, k))

        # sort on cost only: Chromosone objects are not comparable
        tmp.sort(key = lambda x: x[0])
        # now can order best to worst
        for i in range(self.pop_size):
            self.pool[i] = tmp[i][1]

    def breed(self, topN = 20):
        '''
        breed method: refill the pool with children of the
        topN fittest chromosomes, relying on the Chromosone
        combine method. assumes the pool is ordered.
        '''
        tmp = copy.copy(self.pool[:topN])
        for i in range(self.pop_size):
            # pick two distinct parents at random from the fittest
            random.shuffle(tmp)
            parent1 = copy.copy(tmp[0])
            parent2 = tmp[1]
            split = random.randint(1, (len(self.objective) - 1))

            parent1.combine(parent2, split)
            self.pool[i] = parent1

    def mutate(self):
        '''
        mutate method: if a uniform random number is less than
        mutate_prob, select one index at random and replace the
        character there with one sampled from the character set
        '''
        for i in range(self.pop_size):
            if(random.uniform(0, 1) < self.mutate_prob):
                index = random.randrange(len(self.objective))
                stringList = list(self.pool[i].value)
                stringList[index] = random.SystemRandom().choice(
                    string.ascii_uppercase +
                    string.ascii_lowercase +
                    string.digits +
                    " "
                )
                self.pool[i].value = ''.join(stringList)
```

I'm fully aware that there is a lot of copying involved; I wouldn't claim that this is the most efficient implementation ever written. But that isn't the point: it works, and I'm only coding this up to play around. I highly doubt I would ever use this code in a serious setting: the purpose is to get a good feel for how genetic algorithms work.

# Results

Well, that was fairly straightforward, and really intuitive. The last piece of the puzzle is simply to initiate a population, then order, breed, and mutate until convergence! I define main() with the syntax below; it is just a bit of looping combined with some output written to the terminal (which I save to a file for plotting).

```
def main():
    objective = "Hello World"
    pop = Population(objective)

    print('# gen   cost     value')
    for i in range(5001):
        # order so that pool[0] is the fittest chromosome
        pop.order()
        best = pop.pool[0]

        if(best.cost(objective) == 0):
            print('{:5d} {:6d}     {}'.format(
                i, best.cost(objective), best)
            )
            break

        if (i % 10) == 0:
            print('{:5d} {:6d}     {}'.format(
                i, best.cost(objective), best)
            )

        pop.breed()
        pop.mutate()


if __name__ == '__main__':
    main()
```

I've knocked up a simple plot using one of my favourite plotting programs, gnuplot; I have included the plotting script in the github repository. We can plot the cost as a function of iteration, and get something like the plot below (it will vary due to the random nature of the problem: I have been a little lazy and not set random seeds). If you run the code yourself, you can see the message printed to screen gradually converge upon the desired message (Hello World). The improvement is initially very rapid, and then slows down, taking about 800 iterations (generations) to get the last letter correct. Remember, a cost of zero corresponds to the best Chromosone in the Population reaching the desired message. If you want to play around, you can vary the mutation rate, the population size, and even the definition of the cost function to see how these affect the rate of convergence.

# Other thoughts

Well, I hope this little experiment proves to be an interesting read; it's ridiculously simple, but hopefully it provides an intuitive understanding of how this class of optimiser works, even if the problem is a little contrived.

Genetic algorithms can of course be applied to far more useful problems. For example, in machine learning they are often turned to when automatic feature selection is desired; in this case your chromosome may consist of binary 0/1 indicators of whether each feature should be included, and your cost function is likely to be a suitable metric for model performance. Genetic algorithms are also useful in other situations, such as optimising awkward functions (i.e. those with discontinuous gradients). They will never stand up to methods based around gradient descent for simple optimisations, but when used for appropriate classes of problem, they can prove to be highly useful.
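As a toy sketch of that feature-selection idea (everything here is fabricated for illustration: the `USEFUL` ground truth stands in for what would really be a model-performance metric):

```
import random

# The chromosome is a binary mask over eight features; pretend the
# first three are the useful ones.
USEFUL = [1, 1, 1, 0, 0, 0, 0, 0]

def cost(mask):
    # In a real application this would be, e.g., validation error of a
    # model fitted on the selected features; here it just counts
    # disagreements with the assumed ground truth.
    return sum(m != u for m, u in zip(mask, USEFUL))

population = [[random.randint(0, 1) for _ in range(len(USEFUL))]
              for _ in range(50)]
population.sort(key=cost)
best = population[0]
print(best, cost(best))
```

From here, the breed and mutate steps work exactly as in the string example, just flipping bits instead of swapping characters.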

TL;DR- I wrote a simple example of how a genetic algorithm can be used for optimisation with a 'hello world' problem. If you are interested, the code is here.