10 February 2012

The Genetic Algorithm

Genetic algorithm is one of many techniques in optimization which is based on the principles of genetic and natural selection developed by John Holland (1975)[5,17]. GA optimization technique is inspired by the theories in the biological sciences. According to the name, the processes that occur in GA are similar to what happened in the real biological evolution. In biological science, a group of species is called population. In reality there is not only one population exist, many populations of species that can continue to proliferate, as well as in GA. GA will produce number of population to some predetermined limit. In a long time, there may be a forming of a new species or also called speciation. In this case, a gradual change of heredity that makes up the new traits in that species.

The most important concept is heredity, which is an idea that individual properties can be encoded in a certain way so that the properties can be inherited to the next generation. Each individual brings a genome that contains several chromosomes in the form of DNA molecules. Each chromosome contains a number of genes, where the units of heredity and coding information necessary to build and maintain an individual.

Important concept in the theory of evolution is fitness and selection for reproduction. In the real world there are two ways of reproduction, sexual and asexual reproduction. On sexual reproduction the chromosomes from two individuals (as parents) are combined to produce a new individual. The chromosomes in the new individual are taken from the first parent and also the second. This is called crossover. Sometimes gene copying from the parent can result in an error, and this error is called mutation. Whereas, in asexual reproduction, only one individual parents who paid attention, so there is no crossover process. But mutation process also may occur in asexual reproduction[6].

This method is selected to solve VRP because:
a. GA can handle a problem with a huge solution space (like VRP), which is not possible to find solution with linear programming in relevant time.
b. GA can handle multi-constrained problems, with lots of constraints.
c.  GA has time limit or a resource limit of the problem, so they can be stopped in any time and gives a result for the problem (although the results are still not the optimal one).
d. GA can handle the problem which does not have exact or known algorithm.

Besides the advantages of GA which already explained above, there are also some disadvantages which are:
a. GA shows a very fast initial convergence, but followed by progressive slower improvements, local optimization can handle this problem.
b. Linear programming and brute force using simple model but requiring huge computation time, while GA has a complex model that can be completed more quickly. But models with many parameters are computationally expensive.
c. Because it is randomly evaluated, there are some possibilities that the calculation is trapped in local optimum area.
d. The fitness of all models may be similar, so the convergence is slow.

Components of Genetic Algorithm
The standard GA components consist of [6]:
1.      Fitness Function
Fitness function determines the quality of each individual. If the individual has a better fitness value, it can become parents for the next generation in order to produce a better offspring. Fitness function is determined by the optimization goal, whether it is a minimum or maximum optimization. Unlike the objective function, the resulting value from fitness function is always changing as long as it founds a better value in the next iteration while the objective function value is the best final value of all performed iterations.

2.      A population with N chromosome (individual)
Population is a collection of individuals or chromosomes that carry a specific value inside the genes for an optimization problem. Population may change continuously due to the multiple operations such as crossover and mutation. Population formed at the end contains individuals/chromosomes which hopefully has the highest fitness value.
In GA individuals or chromosomes are number of solutions taken from its solution space. Unlike Brute Force or Linear Programming method that evaluate the entire solution in the solution space, GA only evaluates some parts in solution space which randomly generated in population. Although the individual evaluation is randomly performed, GA has parameters that can lead the random solution generations into the area which has better fitness value in the solution space.

3.      Selection and Inheritance
The selection process is performed to find good individuals which will be used as a parent. By choosing the best individuals, the resulting offspring is expected to have better fitness value that both of its parent. There existed several mthods to select the parents, some of them are:

Roulette-wheel Selection
The idea behind the roulette wheel selection technique is that each individual is given a chance to become a parent in proportion to its fitness. It is called roulette wheel selection as the chances of selecting a parent can be seen as spinning a roulette wheel with the size of the slot for each parent being proportional to its fitness. Those with the largest fitness (slot sizes) have more chance of being chosen. Thus, it is possible for one member to dominate all the others and get selected a high proportion of the time.

Tournament Selection
 In a tournament selection, ‘k’ individual is randomly selected. As the fitness of these individuals decreases, the chance for mating will be lower. An easily understandable method would be the following: select the ‘p’ best individuals from the tournament size ‘k’. Each individual will have a chance to live by its position: position ‘p’ will have a chance to be selected with 1/p. Do notice that the values have to be normalized to make computations easier. As the position is decreased, the possibility, to be selected, decreases exponentially fast. This will give much bigger chance to the good individuals in the selection part  [17].

4.      Crossover
Crossover is a genetic operator used to vary the programming of a chromosome or chromosomes from one generation to the next. Many crossover techniques exist for organisms which use different data structure to store themselves. For example:

One-Point crossover; A single crossover point in both parents’ organism strings is selected. All data beyond that point in either organism string is swapped between the two parent organisms. The resulting organisms are the children:
One Point Crossovers
 Two-point crossover; two crossover points is selected. Everything between the two points is swapped between the parent organisms, rendering two child organisms.
Two Points Crossovers
5.      Mutation
Mutation is a change in one gene in one chromosome. Mutation aimed to prevent the system trapped in a local optimum point. The gene is picked randomly to be the one that mutated. For binary encoding, mutation process can be seen in Figure 2.4 and Figure 2.5 follow:
Mutations in Binary Encoding

Another mutation technique is swapping mutation, which is swapping the location of two genes in one chromosome. This method is often used to solve combinatorial problems.
Swapping Mutation

6.      Crossover and mutation probability
Crossover probability is a ratio of how many couples of chromosomes will be picked to become a parent.
Mutation probability (or ratio) is basically a measure of the likeliness that random elements of the chromosome will be flipped into something else. For example if the chromosome is encoded as a binary string of length 100, and the mutation probability is  1% it means that 1 out of the 100 bits (on average) picked at random will be flipped.

7.      Elitism
Elitism is a method which makes a number of an individual/chromosomes which has the highest fitness value. This is done so that the chromosome with the highest fitness value remains at the top and will only be replaced if another chromosome with higher fitness value is found.

8.      Generational Replacement.
Replacement of all individuals in a population with N new individuals as the result of crossover and mutation operation at the same time.

9.      PF
Since GAs are directly applicable only to unconstrained optimization, it is necessary to use some additional methods that will keep solutions in the feasible region. The most popular approach in GA community to handle constraints is to use penalty functions that penalize infeasible solutions by reducing their fitness values in proportion to their degrees of constraint violation [22]. Penalty method transforms constrained problem to unconstrained one in two ways, which are additive form and multiplicative. The one that is used in this thesis is additive form.


    
Where p(x) presents a penalty term. If no violation occurs, p(x) will be zero and positive otherwise. Under this conversion, the overall objective function now is eval(x) which serves as an evaluation function in GAs. [22]


Source:
Carry Prameswari (2011)
Final Project Report - Literature Study
Solving Vehicle Routing Problem with Simultaneously Pickup and Delivery using Genetic Algorithm and Prins Splitting Procedure

Reference:
[5] Haupt, Randy L., Haupt, Sue Ellen. 2004. “Practical Genetic Algorithm”. John Willey & Sons, Inc. Canada
[6] Suyanto. 2005. “Algoritma genetika dalam Matlab”. Andi Yogyakarta. Yogyakarta, Indonesia
[17] Kovács, Ákos. 2008. “Solving the Vehicle Routing Problem with Genetic Algorithm and Simulated Annealing”. Högskolan Dalarna, Sweden
   [22] Yeniay, Özgür. 2005. “Penalty Function  Methods for Constrained Optimization With Genetic Algorithms”. Hacettepe University, Faculty of Science Department of Statistics. Beytepe, Ankara

No comments:

Post a Comment