A Evolutionary Algorithm (EA), also known as Genetic Algorithm, is a search method used to solve optimization problems inspired by the theory of evolution and bio-genetics. EA is good at exploring large non-linear search spaces for optimal or near optimal solutions.
The solution to an EA optimisation problem uses one or more strings of numeric parameters. The string is called a "Chromosome" and the parameters within it are called "Genes". This is analogous to DNA in living organisms.
The aim of an EA process is to maximise the "fitness" (or minimise the "cost function") by varying the value of Genes whilst respecting any predefined constraints. For example to select the ingredients of a recipe that maximise the value of product yield, less the cost of the ingredients used, whilst maintaining the quality of the product.
There are three types of Chromosomes available for modelling in Viabl.ai Platform:
Crossover and Mutation are applied at binary level to Non-sequence Chromosomes but are applied to the decimal integer value (aka Gene-bound Crossover) for both Sequence and Picklist Chromosomes.
The randomness of the above process allows the effective exploration of the space of solutions. While the selection of effective solutions (individuals) and the mixing of their genes allows the accumulation of good features from partially good solutions. As a result, evolutionary algorithms can explore large domains and converge on a good solution relatively quickly. EA typically also provide a good trade-off between the time taken to find a solution and the quality of the solution. Typically it is desirable to achieve a "good" solution in a short time as opposed to the very optimal solution that may require an realistically long time. One of the aspects of EA which are not reflected in the principals of evolution theory is the option of always "cloning" the best individual from the previous generation. This insures that at the end of the process, the best individual ever found is not lost.
See also How do I model my Resource Optimisation Application in Viabl.ai Platform?
The Viabl.ai Platform provides the ability to solve non-linear search optimisation problems through the use of Evolutionary Algorithms (EA). Optimisation problems involve the search for the values of a set of variables (Genes) that provide a valid optimal, or near optimal, solution to the problem.
The measure used to identify the optimal solution is expressed in the cost function . The algorithm invokes this cost function multiple times with a different set of variable (Gene) values and the cost function calculates the relative "cost" of the solution provided by these Gene values. An examples of a cost function might be how expensive a given solution would be to implement, or, how long a solution would take to be completed. The cost function need not represent the absolute real cost, only the cost relative to other potential solutions
Genes are grouped, at design time, into a Chromosome based on their type. An optimisation problem is expressed by one or more Chromosomes. An Individual consists of a complete set of Gene values generated by the algorithm (a suggested solution to be evaluated by the cost function). The algorithm "evolves" the best solutions over a specified number of "Generations", by allowing Gene values from "good" individuals a higher chance of being carried to the next generation.
Once the specified number of generations has been completed, the values are fixed at the "best" encountered solution and Viabl.ai Platform inference continues.
The optimization engine is called from script via a property of the xpertrule object.
A minimal example of an optimization problem might look like this...
var cost = xpertrule.optimization.run({
settings: {
direction: "minimize",
nGenerations: 100,
nIndividuals: 100,
pCrossover: 0.6,
pMutation: 0.05
},
chromosomes: [
{
type: "nonsequenced",
genes: [
{
attribute: #speed
},
{
attribute: #direction,
minimum: 0,
maximum: 359
}
]
}
],
costFn: function() {
return (100 - #speed.val()) + #direction.val();
}
});
From the example above, you can see that there are 3 required properties (settings, chromosomes & costFn). Let's look at what each of these sections does...
The settings object contains the parameters to control the optimization. From the above example...
Parameter | Value | Description |
---|---|---|
direction | "minimize" | Tells the optimization engine that the optimal solution would have the lowest cost . The cost is the value returned by the costFn function for each individual. |
nGenerations | "100" | The number of generations (collections of individuals) that optimization should run for. |
nIndividuals | "100" | The number of individuals (sets of input values) in each generation. |
pCrossover | "0.6" | The probability of a child's gene value being a the result of crossover from both parents - as a oppose to just being a clone of the first parent (in this case 60% chance) |
pMutation | "0.05" | The probability (in this case 5% chance) of gene value being mutated (flipped) during crossover. |
The Chromosomes array contains a list of (one or more) chromosome definitions. In this example, we just have 1 non-sequenced chromosome.
Non-sequenced chromosome definitions contain a genes array which holds objects with references to the Viabl.ai Platform Numeric Objects tied to the individual's genes. These numeric objects are assigned values before each call to the cost function.
The minimum and maximum values of the individual genes can be specified in 2 different ways. See the "Additional Non-Sequenced Chromosome Settings" section for more details.
Additionally, the precision (i.e. number of decimal places) is controlled by the individual objects' "Decimal places" property.
The cost function (costFn) is called multiple times with varying sets of gene values.
The Viabl.ai Platform Numeric Objects (associated with the genes) are assigned values prior to costFn being called. It is then the job of the cost function to calculate and return a "cost" for this particular set of genes. This cost is then used by the optimization engine to converge on the optimal solution.
Tip: If you have your cost function written in a Viabl.ai procedure object called myCostFunc, the costFn parameter would look like this...
costFn: function() {
return #myCostFunc.run();
}
Along with the 5 required parameters, there are some additional settings which can enable and control enhanced optimization features...
timeout - The optional timeout parameter specifies the MAXIMUM number of seconds to perform optimization for.
settings: {
direction: "minimize",
nGenerations: 100,
nIndividuals: 100,
pCrossover: 0.6,
pMutation: 0.05,
timeout: 3
}
timeout=3 means after 3 seconds, stop the search for a better solution and return the best solution found so far.
Initial Seed - The optional initialSeed parameter allows the developer to pre-supply gene values (for non-sequenced chromosomes) for the first individual of the first generation. The concept is that if an "acceptable" solution is known in advance, this can be provided so as to guarantee a result at least as good as this.
At the start of optimization; if a non-sequenced gene's attribute has a value, it's value is used for the appropriate gene in the first individual of the first generation. If the attribute is empty, a random value is used instead. Note: The pre-supplied value must be in the valid range defined in the attribute.
settings: {
direction: "minimize",
nGenerations: 100,
nIndividuals: 100,
pCrossover: 0.6,
pMutation: 0.05,
initialSeed: true
}
Adaptation - Adaptation is the process whereby an individual has a probability of producing a number of additional (mutated) individuals.
settings: {
direction: "minimize",
nGenerations: 100,
nIndividuals: 100,
pCrossover: 0.6,
pMutation: 0.05,
adaptationStart: 95,
pAdaptation: 0.1,
nAdaptationCycles: 4
}
Adaptation is enabled by providing the adaptationStart, pAdaptation and nAdaptationSysles parameters.
Hill Climbing - Hill climbing is an additional operation which can optionally be performed at the end of the standard generations for non-sequenced chromosomes. With hill climbing enabled, individual gene values are raised and/or lowered by small increments in an attempt to find the optimal solution within the local minima identified by the main optimization process.
settings: {
direction: "minimize",
nGenerations: 100,
nIndividuals: 100,
pCrossover: 0.6,
pMutation: 0.05,
hillClimb: true
}
Hill climbing is enabled by providing the hillClimb parameter and setting it to true .
Non-sequenced chromosomes (as seen from the initial example) can also contain optional parameters to further control the optimization engine
The minimum and maximum values for each individual gene can be specified in 2 different ways...
{
type: "nonsequenced",
sumConstraint: 100,
genes: [
{
attribute: #flour,
minimum: 10,
maximum: 20
},
{
attribute: #milk,
minimum: 0,
maximum: 100
}
]
}
Sum Constraint - The sum constraint setting indicates that the optimization engine should make sure that the total value of all the genes in a chromosome are a specific value.
An example use of this feature might be a problem where the genes in a chromosome represent percentage values of recipe ingredients. In this scenario, the sumConstraint would be 100 (as all genes values bust add up to 100%.)
{
type: "nonsequenced",
sumConstraint: 100,
genes: [
{
attribute: #flour
},
{
attribute: #milk
}
]
}
In the above example, the values of "flour" and "milk" would always add up to 100 (i.e. flour + milk = 100)
Sequenced Chromosomes - Sequenced chromosomes contain a number of genes, each of which holds a unique value in the range 1 to number_of_genes. This type of chromosome is incredibly useful for a class of problem known as scheduling . Scheduling problems tend to represent a list of actions/locations/activities which must be completed but the optimal order is not readily known.
A classic scheduling example is known as the Travelling Salesman Problem (TSP) . In this problem, a list of cities (and their distances) is supplied and the problem is to find the shortest route between them.
If a TSP problem contained 12 cities, you would define a sequenced chromosome with 12 genes. For an individual, each gene would have a unique value in the range 1 to 12. This would be used as an index into the city array to calculate the overall distance (cost) for this individual (set of gene values).
{
type: "sequenced",
nGenes: 12,
array: #cityIndexArr
}
Notice the type of chromosome is now "sequenced" and we have 2 new properties:
Picklist Chromosomes - Picklist chromosomes contain a number of genes, each of which holds a unique value in the range 1 to range (range being specified along with the chromosome definition). This type of chromosome is useful for classes of problem which require picking n unique items from a list of m possible items.
An example follows of a picklist chromosome which selects 5 unique options from a list of 10.
{
type: "picklist",
nGenes: 5,
range: 10,
array: #itemSelArr
}
Notice the type of chromosome is now "picklist" and we have 3 new properties: