HELP!

Evolutionary Algorithm (EA) Optimisation

What is an Evolutionary Algorithm?

A Evolutionary Algorithm (EA), also known as Genetic Algorithm, is a search method used to solve optimization problems inspired by the theory of evolution and bio-genetics. EA is good at exploring large non-linear search spaces for optimal or near optimal solutions.

The solution to an EA optimisation problem uses one or more strings of numeric parameters. The string is called a "Chromosome" and the parameters within it are called "Genes". This is analogous to DNA in living organisms.

The aim of an EA process is to maximise the "fitness" (or minimise the "cost function") by varying the value of Genes whilst respecting any predefined constraints. For example to select the ingredients of a recipe that maximise the value of product yield, less the cost of the ingredients used, whilst maintaining the quality of the product.

How is the EA executed?

  • EA creates a number "individuals" with random gene values to form the first "generation" .
  • EA uses the "cost function" to assesses the individuals and rank each individual in order of effectiveness (fitness to survive).
  • EA creates a new generation by randomly selecting pairs of individuals (parents) and mixing their genes to form new individuals (children). This mixing process includes is called "crossover". The "random" selection of a parent is biased towards more effective parents, to increase the ratio of children carrying the more "desirable" gene values.
  • EA can also apply very low levels of "mutation" (random changes to the gene values inherited from the parents) to help the search "escape" from an area where local minima/maxima is found.
  • Repeat steps 2 to 4 for a given number of cycles (generations), or until the fitness reaches the target, or for a specified amount of processing time.
  • The EA can also fine-tune the best solution found by making small, random, "Adaptations", to the gene values of the this individual, and check to see if this improves.

Chromosome Types

There are three types of Chromosomes available for modelling in Viabl.ai Platform:

  • Standard non-Sequence Chromosome. This is a collection of Genes, each representing a numeric value in a specified range (typically representing a "quantity").
  • Sequence Chromosome. This is a collection of ordered Genes, each representing a discrete integer value in a specified range (typically representing an ID index of a resource)
  • Picklist Chromosome. This is a collection of Genes representing a given sub-set of a set of specified integer values. (typically representing an ID index of a resource)

Crossover and Mutation are applied at binary level to Non-sequence Chromosomes but are applied to the decimal integer value (aka Gene-bound Crossover) for both Sequence and Picklist Chromosomes.

The randomness of the above process allows the effective exploration of the space of solutions. While the selection of effective solutions (individuals) and the mixing of their genes allows the accumulation of good features from partially good solutions. As a result, evolutionary algorithms can explore large domains and converge on a good solution relatively quickly. EA typically also provide a good trade-off between the time taken to find a solution and the quality of the solution. Typically it is desirable to achieve a "good" solution in a short time as opposed to the very optimal solution that may require an realistically long time. One of the aspects of EA which are not reflected in the principals of evolution theory is the option of always "cloning" the best individual from the previous generation. This insures that at the end of the process, the best individual ever found is not lost.

Using EA in the Viabl.ai Platform

See also How do I model my Resource Optimisation Application in Viabl.ai Platform?

The Viabl.ai Platform provides the ability to solve non-linear search optimisation problems through the use of Evolutionary Algorithms (EA). Optimisation problems involve the search for the values of a set of variables (Genes) that provide a valid optimal, or near optimal, solution to the problem.

The measure used to identify the optimal solution is expressed in the cost function . The algorithm invokes this cost function multiple times with a different set of variable (Gene) values and the cost function calculates the relative "cost" of the solution provided by these Gene values. An examples of a cost function might be how expensive a given solution would be to implement, or, how long a solution would take to be completed. The cost function need not represent the absolute real cost, only the cost relative to other potential solutions

Genes are grouped, at design time, into a Chromosome based on their type. An optimisation problem is expressed by one or more Chromosomes. An Individual consists of a complete set of Gene values generated by the algorithm (a suggested solution to be evaluated by the cost function). The algorithm "evolves" the best solutions over a specified number of "Generations", by allowing Gene values from "good" individuals a higher chance of being carried to the next generation.

Once the specified number of generations has been completed, the values are fixed at the "best" encountered solution and Viabl.ai Platform inference continues.

Calling the Optimization Engine

The optimization engine is called from script via a property of the xpertrule object.

A minimal example of an optimization problem might look like this...

var cost = xpertrule.optimization.run({
  settings: {
    direction: "minimize",
    nGenerations: 100,
    nIndividuals: 100,
    pCrossover: 0.6,
    pMutation: 0.05
  },
  chromosomes: [
    {
      type: "nonsequenced",
      genes: [
        {
          attribute: #speed
        },
        {
          attribute: #direction,
          minimum: 0,
          maximum: 359
        }
      ]
    }
  ],
  costFn: function() {
    return (100 - #speed.val()) + #direction.val();
  }
});
         

From the example above, you can see that there are 3 required properties (settings, chromosomes & costFn). Let's look at what each of these sections does...

settings

The settings object contains the parameters to control the optimization. From the above example...

Parameter Value Description
direction "minimize" Tells the optimization engine that the optimal solution would have the lowest cost . The cost is the value returned by the costFn function for each individual.
nGenerations "100" The number of generations (collections of individuals) that optimization should run for.
nIndividuals "100" The number of individuals (sets of input values) in each generation.
pCrossover "0.6" The probability of a child's gene value being a the result of crossover from both parents - as a oppose to just being a clone of the first parent (in this case 60% chance)
pMutation "0.05" The probability (in this case 5% chance) of gene value being mutated (flipped) during crossover.
chromosomes

The Chromosomes array contains a list of (one or more) chromosome definitions. In this example, we just have 1 non-sequenced chromosome.

Non-sequenced chromosome definitions contain a genes array which holds objects with references to the Viabl.ai Platform Numeric Objects tied to the individual's genes. These numeric objects are assigned values before each call to the cost function.

The minimum and maximum values of the individual genes can be specified in 2 different ways. See the "Additional Non-Sequenced Chromosome Settings" section for more details.

Additionally, the precision (i.e. number of decimal places) is controlled by the individual objects' "Decimal places" property.

costFn

The cost function (costFn) is called multiple times with varying sets of gene values.

The Viabl.ai Platform Numeric Objects (associated with the genes) are assigned values prior to costFn being called. It is then the job of the cost function to calculate and return a "cost" for this particular set of genes. This cost is then used by the optimization engine to converge on the optimal solution.

Tip: If you have your cost function written in a Viabl.ai procedure object called myCostFunc, the costFn parameter would look like this...

costFn: function() {
  return #myCostFunc.run();
}

Additional Optimization Settings

Along with the 5 required parameters, there are some additional settings which can enable and control enhanced optimization features...

timeout - The optional timeout parameter specifies the MAXIMUM number of seconds to perform optimization for.

settings: {
  direction: "minimize",
  nGenerations: 100,
  nIndividuals: 100,
  pCrossover: 0.6,
  pMutation: 0.05,
  timeout: 3
}

timeout=3 means after 3 seconds, stop the search for a better solution and return the best solution found so far.

Initial Seed - The optional initialSeed parameter allows the developer to pre-supply gene values (for non-sequenced chromosomes) for the first individual of the first generation. The concept is that if an "acceptable" solution is known in advance, this can be provided so as to guarantee a result at least as good as this.

At the start of optimization; if a non-sequenced gene's attribute has a value, it's value is used for the appropriate gene in the first individual of the first generation. If the attribute is empty, a random value is used instead. Note: The pre-supplied value must be in the valid range defined in the attribute.

settings: {
  direction: "minimize",
  nGenerations: 100,
  nIndividuals: 100,
  pCrossover: 0.6,
  pMutation: 0.05,
  initialSeed: true
}

Adaptation - Adaptation is the process whereby an individual has a probability of producing a number of additional (mutated) individuals.

settings: {
  direction: "minimize",
  nGenerations: 100,
  nIndividuals: 100,
  pCrossover: 0.6,
  pMutation: 0.05,
  adaptationStart: 95,
  pAdaptation: 0.1,
  nAdaptationCycles: 4
}

Adaptation is enabled by providing the adaptationStart, pAdaptation and nAdaptationSysles parameters.

  • adaptationStart "95" - Tells the optimization engine at this generation number the process of adaptation should start. The advantage of delaying the start is that later generations have more optimal solutions to work from.
  • pAdaptation "0.05" - The probability (in this case 10%) of adaptation being applied to an individual (once the generation number >= adaptationStart).
  • nAdaptationCycles "4" - The number of adaptation systems to perform (once adaptation has been started based on adaptationStart and pAdaptation.) An adaptation system takes the individual, mutates a single gene and re-costs. If the new individual is better, is in then used in subsequent adaptation systems.

Hill Climbing - Hill climbing is an additional operation which can optionally be performed at the end of the standard generations for non-sequenced chromosomes. With hill climbing enabled, individual gene values are raised and/or lowered by small increments in an attempt to find the optimal solution within the local minima identified by the main optimization process.

settings: {
  direction: "minimize",
  nGenerations: 100,
  nIndividuals: 100,
  pCrossover: 0.6,
  pMutation: 0.05,
  hillClimb: true
}

Hill climbing is enabled by providing the hillClimb parameter and setting it to true .

Additional Non-Sequenced Chromosome Settings

Non-sequenced chromosomes (as seen from the initial example) can also contain optional parameters to further control the optimization engine

Specifying Gene Minimum and Maximum Values

The minimum and maximum values for each individual gene can be specified in 2 different ways...

  1. Each gene in the optimization object can (optimally) have minimum and maximum properties to specify the range. e.g.
{
  type: "nonsequenced",
  sumConstraint: 100,
  genes: [
    {
      attribute: #flour,
      minimum: 10,
      maximum: 20
    },
    {
      attribute: #milk,
      minimum: 0,
      maximum: 100
    }
  ]
}
  1. If there is no minimum or maximum property specified for a gene, the minimum and maximum is taken from the object properties of the relevant object. Note: This is also how the precision (i.e. decimal places) is obtained.

Sum Constraint - The sum constraint setting indicates that the optimization engine should make sure that the total value of all the genes in a chromosome are a specific value.

An example use of this feature might be a problem where the genes in a chromosome represent percentage values of recipe ingredients. In this scenario, the sumConstraint would be 100 (as all genes values bust add up to 100%.)

{
  type: "nonsequenced",
  sumConstraint: 100,
  genes: [
    {
      attribute: #flour
    },
    {
      attribute: #milk
    }
  ]
}

In the above example, the values of "flour" and "milk" would always add up to 100 (i.e. flour + milk = 100)

Sequenced Chromosomes - Sequenced chromosomes contain a number of genes, each of which holds a unique value in the range 1 to number_of_genes. This type of chromosome is incredibly useful for a class of problem known as scheduling . Scheduling problems tend to represent a list of actions/locations/activities which must be completed but the optimal order is not readily known.

A classic scheduling example is known as the Travelling Salesman Problem (TSP) . In this problem, a list of cities (and their distances) is supplied and the problem is to find the shortest route between them.

If a TSP problem contained 12 cities, you would define a sequenced chromosome with 12 genes. For an individual, each gene would have a unique value in the range 1 to 12. This would be used as an index into the city array to calculate the overall distance (cost) for this individual (set of gene values).

{
  type: "sequenced",
  nGenes: 12,
  array: #cityIndexArr
}

Notice the type of chromosome is now "sequenced" and we have 2 new properties:

  • nGenes=12 is the number of genes in this chromosome. In this case genes will have a unique value between 1 and 12 (i.e. no duplicate gene values in an individual.)
  • array=#cityIndexArr is a reference to an Viabl.ai Platform Numeric Array Object. The array should be have sufficient dimensions defined to hold the number of genes (in this example the array should be at lease 12 X 1)

Picklist Chromosomes - Picklist chromosomes contain a number of genes, each of which holds a unique value in the range 1 to range (range being specified along with the chromosome definition). This type of chromosome is useful for classes of problem which require picking n unique items from a list of m possible items.

An example follows of a picklist chromosome which selects 5 unique options from a list of 10.

{
  type: "picklist",
  nGenes: 5,
  range: 10,
  array: #itemSelArr
}

Notice the type of chromosome is now "picklist" and we have 3 new properties:

  • nGenes=5 is the number of genes in this chromosome.
  • range=10 is the range of values (between 1 and range) to choose from. In this case we have 5 genes which will have a unique value between 1 and 10 (i.e. no duplicate gene values in an individual.)
  • array=#itemSelArr is a reference to an Viabl.ai Platform Numeric Array Object. The array should be to sufficient dimensions defined to hold the required number of genes (in this example the array should be at lease 5 x 1)

On This Page