Genetic Algorithms
Genetic algorithms are implemented as a computer simulation in which a population of abstract representations (called chromosomes, the genotype, or the genome) of candidate solutions (called individuals, creatures, or phenotypes) to an optimization problem evolves toward better solutions. Traditionally, solutions are represented in binary as strings of 0s and 1s, but other encodings are also possible. The evolution usually starts from a population of randomly generated individuals and happens in generations. In each generation, the fitness of every individual in the population is evaluated, multiple individuals are stochastically selected from the current population (based on their fitness), and modified (recombined and possibly randomly mutated) to form a new population. The new population is then used in the next iteration of the algorithm. Commonly, the algorithm terminates when either a maximum number of generations has been produced or a satisfactory fitness level has been reached for the population. If the algorithm has terminated because the maximum number of generations was reached, a satisfactory solution may or may not have been found.

Genetic algorithms find application in biogenetics, computer science, engineering, economics, chemistry, manufacturing, mathematics, physics and other fields.

A typical genetic algorithm requires two things to be defined:
# a genetic representation of the solution domain,
# a fitness function to evaluate the solution domain.

A standard representation of the solution is as an array of bits. Arrays of other types and structures can be used in essentially the same way. The main property that makes these genetic representations convenient is that their parts are easily aligned due to their fixed size, which facilitates a simple crossover operation. Variable-length representations may also be used, but crossover implementation is more complex in this case.
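The generational loop and the fixed-length bit-array representation described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not a definitive implementation: the tournament of size 2, the one-point crossover, the per-bit mutation rate, the default parameter values and all function names are hypothetical choices made for the example.

```python
import random

def one_point_crossover(a, b, rng):
    """Combine two equal-length bit lists: head of `a`, tail of `b`."""
    cut = rng.randrange(1, len(a))  # cut strictly inside the string
    return a[:cut] + b[cut:]

def mutate(bits, rate, rng):
    """Flip each bit independently with probability `rate`."""
    return [1 - b if rng.random() < rate else b for b in bits]

def evolve(fitness, n_bits=20, pop_size=30, max_generations=200,
           target=None, mutation_rate=0.02, seed=0):
    """Run a simple generational GA; return (best, best_fitness, generations)."""
    rng = random.Random(seed)
    # Start from a population of randomly generated individuals.
    population = [[rng.randint(0, 1) for _ in range(n_bits)]
                  for _ in range(pop_size)]
    for generation in range(max_generations):
        # Evaluate the fitness of every individual.
        scored = [(fitness(ind), ind) for ind in population]
        best_fit, best = max(scored, key=lambda s: s[0])
        # Stop early once a satisfactory fitness level is reached.
        if target is not None and best_fit >= target:
            return best, best_fit, generation

        def pick():
            # Stochastic, fitness-based selection (assumed: tournament of 2).
            x, y = rng.sample(scored, 2)
            return x[1] if x[0] >= y[0] else y[1]

        # Recombine and mutate to form the new population.
        population = [
            mutate(one_point_crossover(pick(), pick(), rng), mutation_rate, rng)
            for _ in range(pop_size)
        ]
    # Maximum number of generations reached; the result may not be satisfactory.
    best = max(population, key=fitness)
    return best, fitness(best), max_generations
```

For example, `evolve(sum, n_bits=10, target=10)` searches for the all-ones string, using the count of 1 bits as an assumed toy fitness function. Because every chromosome has the same length, the crossover only needs a single cut index that is valid for both parents.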
Tree-like representations are explored in genetic programming and graph-form representations are explored in evolutionary programming.

The fitness function is defined over the genetic representation and measures the ''quality'' of the represented solution. The fitness function is always problem dependent. For instance, in the knapsack problem we want to maximize the total value of objects that we can put in a knapsack of some fixed capacity. A representation of a solution might be an array of bits, where each bit represents a different object, and the value of the bit (0 or 1) represents whether or not the object is in the knapsack. Not every such representation is valid, as the total size of the selected objects may exceed the capacity of the knapsack. The ''fitness'' of the solution is the sum of values of all objects in the knapsack if the representation is valid, or 0 otherwise. In some problems, it is hard or even impossible to define the fitness expression; in these cases, interactive genetic algorithms are used.

Once the genetic representation and the fitness function are defined, the GA proceeds to initialize a population of solutions randomly, then improves it through repeated application of mutation, crossover, inversion and selection operators.

Initialization

Initially many individual solutions are randomly generated to form an initial population. The population size depends on the nature of the problem, but typically contains several hundreds or thousands of possible solutions. Traditionally, the population is generated randomly, covering the entire range of possible solutions (the ''search space''). Occasionally, the solutions may be "seeded" in areas where optimal solutions are likely to be found.

Selection

See also: Selection (genetic algorithm)

During each successive generation, a proportion of the existing population is selected to breed a new generation.
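As a concrete reference point for the selection discussion, the knapsack fitness rule and the random initialization described earlier can be sketched in Python. This is only an illustration: the item values, sizes and capacity are made-up example data, and the function names are hypothetical.

```python
import random

def knapsack_fitness(bits, values, sizes, capacity):
    """Sum of the selected values if the selection fits, otherwise 0."""
    total_size = sum(s for b, s in zip(bits, sizes) if b)
    if total_size > capacity:
        return 0  # invalid representation: objects exceed the capacity
    return sum(v for b, v in zip(bits, values) if b)

def random_population(pop_size, n_items, rng):
    """Randomly generated bit arrays covering the whole search space."""
    return [[rng.randint(0, 1) for _ in range(n_items)]
            for _ in range(pop_size)]

# Hypothetical example data: four objects and a knapsack of capacity 10.
values = [10, 40, 30, 50]
sizes = [5, 4, 6, 3]
capacity = 10

print(knapsack_fitness([0, 1, 0, 1], values, sizes, capacity))  # size 7 fits: 90
print(knapsack_fitness([1, 1, 1, 0], values, sizes, capacity))  # size 15 is too large: 0
```

Scoring every invalid selection as 0 follows the rule stated in the text; it gives the search no hint about how far an invalid solution is from being valid, which is a known trade-off of this simple scheme.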
Individual solutions are selected through a ''fitness-based'' process, where fitter solutions (as measured by a fitness function) are typically more likely to be selected. Certain selection methods rate the fitness of each solution and preferentially select the best solutions. Other methods rate only a random sample of the population, as rating every solution may be very time-consuming.

Most selection functions are stochastic and designed so that a small proportion of less fit solutions are also selected. This helps keep the diversity of the population large, preventing premature convergence on poor solutions. Popular and well-studied selection methods include roulette wheel selection and tournament selection.

Reproduction

See also: crossover (genetic algorithm), mutation (genetic algorithm)

The next step is to generate a second-generation population of solutions from those selected, through crossover (also called recombination) and/or mutation.

For each new solution to be produced, a pair of "parent" solutions is selected for breeding from the pool selected previously. By producing a "child" solution using the above methods of crossover and mutation, a new solution is created which typically shares many of the characteristics of its "parents". New parents are selected for each child, and the process continues until a new population of solutions of appropriate size is generated.

These processes ultimately result in a next-generation population of chromosomes that is different from the initial generation. Generally the average fitness of the population will have increased by this procedure, since mostly the best organisms from the first generation are selected for breeding, along with a small proportion of less fit solutions, for the reasons already mentioned above.

Termination

This generational process is repeated until a termination condition has been reached. Common terminating conditions include reaching a maximum number of generations or reaching a satisfactory fitness level.
Pseudo-code algorithm

# Choose initial population
# Evaluate the fitness of each individual in the population
# Repeat
## Select best-ranking individuals to reproduce
## Breed new generation through crossover and mutation (genetic operations) and give birth to offspring
## Evaluate the individual fitnesses of the offspring
## Replace worst ranked part of population with offspring
# Until a termination condition is reached

Observations

There are several general observations about the generation of solutions via a genetic algorithm:
VARIANTS

The simplest algorithm represents each chromosome as a bit string. Typically, numeric parameters can be represented by integers, though it is possible to use floating point representations. The floating point representation is natural to evolution strategies and evolutionary programming. The notion of real-valued genetic algorithms has been offered but is a misnomer, because it does not really represent the building block theory that was proposed by Holland in the 1970s. This theory is not without support though, based on theoretical and experimental results (see below). The basic algorithm performs crossover and mutation at the bit level. Other variants treat the chromosome as a list of numbers which are indexes into an instruction table, nodes in a linked list, hashes, objects, or any other imaginable data structure. Crossover and mutation are performed so as to respect data element boundaries. For most data types, specific variation operators can be designed. Different chromosomal data types seem to work better or worse for different specific problem domains.

When bit-string representations of integers are used, Gray coding is often employed. In this way, small changes in the integer can be readily effected through mutations or crossovers. This has been found to help prevent premature convergence at so-called ''Hamming walls'', in which too many simultaneous mutations (or crossover events) must occur in order to change the chromosome to a better solution.

Other approaches involve using arrays of real-valued numbers instead of bit strings to represent chromosomes. Theoretically, the smaller the alphabet, the better the performance, but paradoxically, good results have been obtained from using real-valued chromosomes.

A very successful (slight) variant of the general process of constructing a new population is to allow some of the better organisms from the current generation to carry over to the next, unaltered.
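Carrying the better organisms over unaltered can be sketched in Python as follows. The function name `next_generation`, the `n_elite` parameter and the `make_child` callback are hypothetical and introduced only for illustration; `make_child` stands in for whatever selection, crossover and mutation steps the algorithm otherwise uses.

```python
def next_generation(population, fitness, make_child, n_elite=2):
    """Build a new population of the same size: copy the n_elite fittest
    individuals unchanged and fill the remaining slots with new children."""
    ranked = sorted(population, key=fitness, reverse=True)
    elite = [list(ind) for ind in ranked[:n_elite]]  # copies, left unaltered
    children = [make_child(population)
                for _ in range(len(population) - len(elite))]
    return elite + children
```

Because the copied individuals are never mutated, the best fitness in the population cannot decrease from one generation to the next under this scheme.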
This strategy is known as ''elitist selection''.

Parallel implementations of genetic algorithms come in two flavours. Coarse-grained parallel genetic algorithms assume a population on each of the computer nodes and migration of individuals among the nodes. Fine-grained parallel genetic algorithms assume an individual on each processor node, which interacts with neighboring individuals for selection and reproduction. Other variants, like genetic algorithms for online optimization problems, introduce time-dependence or noise in the fitness function.

It can be quite effective to combine a GA with other optimization methods. A GA tends to be quite good at finding generally good global solutions, but quite inefficient at finding the last few mutations needed to reach the absolute optimum. Other techniques (such as simple hill climbing) are quite efficient at finding the absolute optimum in a limited region. Alternating GA and hill climbing can improve the efficiency of the GA while overcoming the lack of robustness of hill climbing.

A problem that seems to have been overlooked by GAs thus far is that natural evolution maximizes mean fitness rather than the fitness of the individual (the criterion function used in most applications). An algorithm that maximizes mean fitness (without any need for the definition of mean fitness as a criterion function) is Gaussian adaptation (see Kjellström 1970 [1]), provided that the ontogeny of an individual may be seen as a modified recapitulation of evolutionary random steps in the past and that the sum of many random steps tends to become Gaussian distributed (according to the central limit theorem).

This means that the rules of genetic variation may have a different meaning in the natural case. For instance, provided that steps are stored in consecutive order, crossing over may sum a number of steps from maternal DNA, adding a number of steps from paternal DNA, and so on.
This is like adding vectors that are more likely to follow a ridge in the phenotypic landscape. Thus, the efficiency of the process may be increased by many orders of magnitude.

Gaussian adaptation is able to approximate the natural process by an adaptation of the moment matrix of the Gaussian. So, because very many quantitative characters are Gaussian distributed in a large population, Gaussian adaptation may serve as a genetic algorithm, replacing the rules of genetic variation by a Gaussian random number generator working on the phenotypic level. See Kjellström 1996 [2].

Population-based incremental learning is a variation where the population as a whole is evolved rather than its individual members.

PROBLEM DOMAINS

Problems which appear to be particularly appropriate for solution by genetic algorithms include timetabling and scheduling problems, and many scheduling software packages are based on GAs. GAs have also been applied to engineering. Genetic algorithms are often applied as an approach to solve global optimization problems.

As a general rule of thumb, genetic algorithms might be useful in problem domains that have a complex fitness landscape, as recombination is designed to move the population away from local optima that a traditional hill climbing algorithm might get stuck in.

HISTORY

Computer simulations of evolution started as early as 1954 with the work of Nils Aall Barricelli, who was using the computer at the Institute for Advanced Study in Princeton, New Jersey. [3] His 1954 publication was not widely noticed. Starting in 1957, the Australian quantitative geneticist Alex Fraser published a series of papers on simulation of artificial selection of organisms with multiple loci controlling a measurable trait. From these beginnings, computer simulation of evolution by biologists became more common in the early 1960s, and the methods were described in books by Fraser and Burnell (1970) [4] and Crosby (1973) [5].
Fraser's simulations included all of the essential elements of modern genetic algorithms. In addition, Hans Bremermann published a series of papers in the 1960s that also adopted a population of solutions to optimization problems, undergoing recombination, mutation, and selection. Bremermann's research also included the elements of modern genetic algorithms. Other noteworthy early pioneers include Richard Friedberg, George Friedman, and Michael Conrad. Many early papers are reprinted by Fogel (1998). [6]

Although Barricelli, in work he reported in 1963, had simulated the evolution of the ability to play a simple game, [7] artificial evolution became a widely recognized optimization method as a result of the work of Ingo Rechenberg and Hans-Paul Schwefel in the 1960s and early 1970s; Rechenberg's group was able to solve complex engineering problems through evolution strategies (1971 PhD thesis and resulting 1973 book). Another approach was the evolutionary programming technique of Lawrence J. Fogel, which was proposed for generating artificial intelligence. Evolutionary programming originally used finite state machines for predicting environments, and used variation and selection to optimize the predictive logics. Genetic algorithms in particular became popular through the work of John Holland in the early 1970s, and particularly his 1975 book. His work originated with studies of cellular automata, conducted by Holland and his students at the University of Michigan. Research in GAs remained largely theoretical until the mid-1980s, when the First International Conference on Genetic Algorithms was held in Pittsburgh, Pennsylvania.

As academic interest grew, the dramatic increase in desktop computational power allowed for practical application of the new technique. In 1989, The New York Times writer John Markoff wrote about Evolver, the first commercially available desktop genetic algorithm.
Custom computer applications began to emerge in a wide variety of fields, and these algorithms are now used by a majority of Fortune 500 companies to solve difficult scheduling, data fitting, trend spotting and budgeting problems, and virtually any other type of combinatorial optimization problem.

Most applications do not use traditional genetic algorithms but a broader set of evolutionary algorithms that incorporate facets of evolution strategies, evolutionary programming, and genetic algorithms.

RELATED TECHNIQUES
BUILDING BLOCK HYPOTHESIS

Genetic algorithms are simple to implement, but their behavior is difficult to understand. In particular, it is difficult to understand why they are often successful in generating solutions of high fitness. The building block hypothesis (BBH) consists of:
# A description of an abstract adaptive mechanism that performs adaptation by recombining "building blocks", i.e. low-order, low defining-length schemata with above-average fitness.
# A hypothesis that a genetic algorithm performs adaptation by implicitly and efficiently implementing this abstract adaptive mechanism.

Goldberg (1989:41) describes the abstract adaptive mechanism as follows:

:Short, low order, and highly fit schemata are sampled, recombined [crossed over], and resampled to form strings of potentially higher fitness. In a way, by working with these particular schemata [building blocks], we have reduced the complexity of our problem; instead of building high-performance strings by trying every conceivable combination, we construct better and better strings from the best partial solutions of past samplings.

:Just as a child creates magnificent fortresses through the arrangement of simple blocks of wood [building blocks], so does a genetic algorithm seek near optimal performance through the juxtaposition of short, low-order, high-performance schemata, or building blocks.

Goldberg (1989) claims that the building block hypothesis is supported by Holland's schema theorem.

The building block hypothesis has been sharply criticized on the grounds that it lacks theoretical justification, and experimental results have been published that draw its veracity into question. On the theoretical side, for example, Wright et al. state that

:"The various claims about GAs that are traditionally made under the name of the ''building block hypothesis'' have, to date, no basis in theory and, in some cases, are simply incoherent" [11]

On the experimental side, uniform crossover was seen to outperform one-point and two-point crossover on many of the fitness functions studied by Syswerda. [12] Summarizing these results, Fogel remarks that

:"Generally, uniform crossover yielded better performance than two-point crossover, which in turn yielded better performance than one-point crossover" [13]

Syswerda's results contradict the building block hypothesis because uniform crossover is extremely disruptive of short schemata, whereas one-point and two-point crossover are more likely to conserve short schemata and combine their defining bits in children produced during recombination.

The debate over the building block hypothesis demonstrates that the issue of how GAs "work" (i.e. perform adaptation) is currently far from settled. (See the External Links section for more about this.)

APPLICATIONS
REFERENCES
EXTERNAL LINKS