Programmed Evolution

Scientists and engineers employ various approaches to design optimal structures, methods, programs, machines, molecules, etc. One approach that is gaining popularity is the memetic algorithm.

What are they and how do they work?
Memetic algorithms (MAs) are search techniques that solve problems by mimicking molecular processes of evolution, including selection, recombination, mutation and inheritance. To understand the basics, a few important aspects of MAs need to be considered (Figure 1):

[*]The fitness landscape needs to be finite.
[*]The search space of the MA is limited to the fitness landscape.
[*]There is at least one solution in the fitness landscape.
[*]A fitness function determines the relationship between the fitness of the genotype (or phenotype) and the fitness landscape.
[*]Selection is based on fitness.

Figure 1: A) Basic layout of a memetic algorithm. A population of individuals is randomly seeded with regard to fitness (initialized). The individuals are randomly mutated and their fitness is measured. Individuals with optimal fitness are further mutated until they converge on a local optimum. The process is carried out for the entire initialized population. The global optimum is selected from the various local optima. B) Fitness landscape with local optima (A, B and D) and a global optimum (C). In a memetic algorithm, the individuals of the initial population are randomly seeded, and each can be viewed as any of the arrows indicated in the figure.
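The loop described above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not how any particular program implements it: the one-dimensional landscape x*sin(x) on the interval 0 to 10, the population size, the step sizes and the function names are all made up for the example.

```python
import math
import random

LOW, HIGH = 0.0, 10.0  # assumed finite search space (hypothetical)


def fitness(x):
    # Hypothetical landscape: a local peak near x = 2.03, the global peak near x = 7.98.
    return x * math.sin(x)


def clamp(x):
    # Keep every individual inside the finite landscape.
    return min(HIGH, max(LOW, x))


def local_search(x, rng, steps=20, step_size=0.1):
    # Hill climbing: keep a small random mutation only if it improves fitness.
    best = x
    for _ in range(steps):
        candidate = clamp(best + rng.uniform(-step_size, step_size))
        if fitness(candidate) > fitness(best):
            best = candidate
    return best


def memetic_search(pop_size=20, generations=30, seed=0):
    rng = random.Random(seed)
    # Initialization: individuals are seeded at random positions.
    population = [rng.uniform(LOW, HIGH) for _ in range(pop_size)]
    for _ in range(generations):
        # Local refinement of every individual.
        population = [local_search(x, rng) for x in population]
        # Selection based on fitness: keep the fitter half.
        population.sort(key=fitness, reverse=True)
        parents = population[: pop_size // 2]
        # Recombination (averaging two parents) followed by random mutation.
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            children.append(clamp((a + b) / 2 + rng.gauss(0, 0.5)))
        population = parents + children
    # The global optimum is picked from the refined individuals.
    return max(population, key=fitness)


best = memetic_search()
print(f"best x = {best:.2f}, fitness = {fitness(best):.2f}")
```

Changing the seed changes the paths the individuals take, but on this toy landscape the search should still end near the same peak.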

Autodock (a molecular docking program) employs MAs to predict the orientation of a ligand within a protein receptor. A docking run with Autodock can be characterized by the following:

[*]Finite fitness landscape: The physical properties of the protein receptor (e.g. electrostatic properties, Van der Waals interactions, desolvation energies etc.). This can be characterized as the pre-existing fitness landscape.
[*]Search space: Confined to the protein receptor.
[*]At least one solution: The original crystallographic pose.
[*]Fitness function: The Estimated Free Energy of Binding of a pose. This is determined through a combination of various interactions, including Van der Waals, electrostatic, desolvation, hydrogen bond and torsional free energy.
[*]Selection (guiding function): Selection is based on fitness, i.e. the Estimated Free Energy of Binding of a pose.
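As a rough sketch, the fitness function in the list above can be written as a sum of the listed terms. The symbols are chosen here for illustration only, and any weighting coefficients the program applies to each term are left out because they are not given in this post:

```latex
\Delta G_{\text{bind}} \approx
  \Delta G_{\text{vdW}}
  + \Delta G_{\text{elec}}
  + \Delta G_{\text{desolv}}
  + \Delta G_{\text{hbond}}
  + \Delta G_{\text{tor}}
```

A lower (more negative) estimated free energy of binding corresponds to a fitter pose, so selection favours poses that minimize this sum.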

Using Autodock as an example, a docking simulation of a ligand (a molecule that binds to a protein) was run four times. In each run, 30 populations of 250 individuals (ligands) were randomly placed within the receptor, and the position of each ligand was randomly “mutated”, after which the Estimated Free Energy of the pose was measured. The position of each ligand was “mutated” until a local optimum of the Estimated Free Energy was reached. The local optima of each of the four docking runs were measured, and in all four runs the global optimum that the run converged on corresponded reasonably well to the crystallographic pose (RMSD<1.8). Two conclusions can be reached thus far:
1. The software can predict the best (biologically relevant) pose of a ligand in a protein with reasonable success.
2. Separate runs converged on similar local optima, even though each relied on random variation and selection in a pre-existing finite fitness landscape. The global optimum corresponded well with the original ligand pose.
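For readers unfamiliar with RMSD (root-mean-square deviation), the comparison between a docked pose and the crystallographic pose can be sketched as follows. This is a minimal illustration, not Autodock's own code: the coordinates are invented, both poses are assumed to list the same atoms in the same order and in the same receptor frame (so no superposition is done), and the units are assumed to be ångströms.

```python
import math


def rmsd(pose_a, pose_b):
    """Root-mean-square deviation between two poses given as lists of (x, y, z)."""
    if len(pose_a) != len(pose_b) or not pose_a:
        raise ValueError("poses must be non-empty and contain the same atoms")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(pose_a, pose_b):
        # Squared distance between matching atoms.
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(pose_a))


# Hypothetical three-atom ligand: the docked pose is the crystal pose shifted by 1 along x.
crystal_pose = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.5, 0.0)]
docked_pose = [(1.0, 0.0, 0.0), (2.5, 0.0, 0.0), (2.5, 1.5, 0.0)]

value = rmsd(crystal_pose, docked_pose)
print(f"RMSD = {value:.2f}")       # every atom moved by exactly 1, so the RMSD is 1.00
print("within cutoff:", value < 1.8)  # 1.8 is the cutoff quoted in the text above
```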

It is also interesting that running the software on different occasions results in convergence on similar ligand poses (optimal designs), even though random variation and selection processes were employed in the algorithm. The process is biased towards a few ends…

This software and MAs (mimicking evolutionary processes) can thus be used to design new molecules and predict optimal designs, demonstrating that evolution can be used to find optimal designs in pre-existing fitness landscapes.

Are there parallels between MAs and life and the universe? Exploring these possible parallels should be an interesting scientific exercise.

So let’s look at a few parallels between development and a docking simulation employing memetic algorithms.

The biased nature of development:
Primordial germ cells (PGCs) are prevented from entering the somatic program and are demethylated (genome-wide erasure of existing epigenetic modifications). The gametes are then imprinted (targeted DNA methylation) during gametogenesis, only to be demethylated again after fertilization. During development, DNA is methylated once more, causing totipotent cells to become pluripotent. X-inactivation and reactivation of the paternal X chromosome also occur. The whole process is governed by the genetic and epigenetic program. During the unfolding of this somatic program, random variation and selection occur, ultimately leading to just a few endpoints every time it is successful. The process is constrained (few endpoints) as a result of pre-existing information that is set up during the initiation of the process. All this is controlled by information in the genome.

Article to demonstrate this:
Many Paths, Few Destinations: How Stem Cells Decide What They’ll Become

[QUOTE]How does a stem cell decide what specialized identity to adopt – or simply to remain a stem cell? A new study suggests that the conventional view, which assumes that cells are “instructed” to progress along prescribed signaling pathways, is too simplistic. Instead, it supports the idea that cells differentiate through the collective behavior of multiple genes in a network that ultimately leads to just a few endpoints – just as a marble on a hilltop can travel a nearly infinite number of downward paths, only to arrive in the same valley.[/QUOTE]
Just as in memetic algorithms, there are many paths to an endpoint. Both development and the docking simulation converge on similar endpoints each time they are rerun (e.g. skin in development).

[QUOTE]The findings, published in the May 22 issue of Nature, give a glimpse into how that collective behavior works, and show that cell populations maintain a built-in variability that nature can harness for change under the right conditions. The findings also help explain why the process of differentiating stem cells into specific lineages in the laboratory has been highly inefficient.[/QUOTE]

[QUOTE]"Nature has created an incredibly elegant and simple way of creating variability, and maintaining it at a steady level, enabling cells to respond to changes in their environment in a systematic, controlled way," adds Chang, first author on the paper.[/QUOTE]

[QUOTE]The landscape analogy and collective “decision-making” are concepts unfamiliar to biologists, who have tended to focus on single genes acting in linear pathways. This made the work initially difficult to publish, notes Huang. “It’s hard for biologists to move from thinking about single pathways to thinking about a landscape, which is the mathematical manifestation of the entirety of all the possible pathways,” he says. “A single pathway is not a good way to understand a whole process. Our goal has been to understand the driving force behind it.”[/QUOTE]
Like the docking simulation, development takes place in a pre-existing fitness landscape (the womb). Both processes converge on similar endpoints each time they are run, both are biased towards a few endpoints, and both reach local optima (e.g. skin cells) once the process is complete.
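The marble analogy from the quoted article can be illustrated with a toy calculation. The sketch below is only an assumption-laden picture of "many paths, few destinations": the double-well landscape (x^2 - 1)^2, the number of marbles and the step size are invented for the example and are not taken from the study.

```python
import random


def slope(x):
    # Derivative of the hypothetical landscape V(x) = (x**2 - 1)**2,
    # which has two valleys, at x = -1 and x = +1.
    return 4 * x * (x * x - 1)


def roll_downhill(x, steps=2000, rate=0.01):
    # A "marble" repeatedly moves a small distance in the downhill direction.
    for _ in range(steps):
        x -= rate * slope(x)
    return x


rng = random.Random(0)
starts = [rng.uniform(-2.0, 2.0) for _ in range(100)]    # many different starting points
endpoints = {round(roll_downhill(x), 3) for x in starts}  # distinct resting places

print(len(starts), "starting points,", len(endpoints), "endpoints:", sorted(endpoints))
```

However the starting points are scattered, every marble in this sketch comes to rest in one of the same two valleys.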

Figure 2: Similarities between development and a docking simulation employing a memetic algorithm.

There are certainly many parallels between our own designed docking simulations and the development of life.