package ‘rgp’ · rgp-package the rgp package description rgp is a simple yet flexible genetic...

76
Package ‘rgp’ February 15, 2013 Maintainer Oliver Flasch <[email protected]> License GPL-2 Title R genetic programming framework LazyData yes Author Oliver Flasch, Olaf Mersmann, Thomas Bartz-Beielstein, Joerg Stork Description RGP is a simple modular Genetic Programming (GP) system build in pure R. In addition to general GP tasks, the system supports Symbolic Regression by GP through the familiar R model formula interface. GP individuals are represented as R expressions, an (optional) type system enables domain-specific function sets containing functions of diverse domain- and range types. A basic set of genetic operators for variation (mutation and crossover) and selection is provided. Version 0.2-4 URL http://rsymbolic.org/projects/show/rgp Depends R (>= 2.13.0), methods, emoa (>= 0.4-2), rrules (>= 0.1-0) Suggests igraph (>= 0.5.5), snowfall (>= 1.84) Collate ’breeding.r’ ’building_blocks.r’ ’complexity.r’ ’stypes.r’’creation.r’ ’time_utils.r’ ’function_utils.r’ ’search_space.r’’evoluti Date Repository CRAN Date/Publication 2011-10-20 15:35:16 NeedsCompilation yes 1

Upload: others

Post on 15-Aug-2020

18 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Package ‘rgp’ · rgp-package The RGP package Description RGP is a simple yet flexible Genetic Programming system for the R environment. The system implements classical untyped

Package ‘rgp’February 15, 2013

Maintainer Oliver Flasch <[email protected]>

License GPL-2

Title R genetic programming framework

LazyData yes

Author Oliver Flasch, Olaf Mersmann, Thomas Bartz-Beielstein, Joerg Stork

Description RGP is a simple modular Genetic Programming (GP) systembuild in pure R. In addition to general GP tasks, the systemsupports Symbolic Regression by GP through the familiar R modelformula interface. GP individuals are represented as Rexpressions, an (optional) type system enables domain-specificfunction sets containing functions of diverse domain- and rangetypes. A basic set of genetic operators for variation (mutationand crossover) and selection is provided.

Version 0.2-4

URL http://rsymbolic.org/projects/show/rgp

Depends R (>= 2.13.0), methods, emoa (>= 0.4-2), rrules (>= 0.1-0)

Suggests igraph (>= 0.5.5), snowfall (>= 1.84)

Collate’breeding.r’ ’building_blocks.r’ ’complexity.r’ ’stypes.r’’creation.r’ ’time_utils.r’ ’function_utils.r’ ’search_space.r’’evolution.r’ ’data_driven_gp.r’ ’data_utils.r’’expression_utils.r’ ’fitness.r’ ’individual.r’ ’list_utils.r’’mutation.r’ ’niching.r’ ’plot_utils.r’ ’population.r’’recombination.r’ ’rgp.r’ ’selection.r’ ’similarity.r’’symbolic_regression.r’ ’test_utils.r’

Date

Repository CRAN

Date/Publication 2011-10-20 15:35:16

NeedsCompilation yes

1

Page 2: Package ‘rgp’ · rgp-package The RGP package Description RGP is a simple yet flexible Genetic Programming system for the R environment. The system implements classical untyped

2 R topics documented:

R topics documented:rgp-package . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3arity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4arity.primitive . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4breeding . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5buildingBlocks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6buildingBlockTags . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7customDist . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7dataDrivenGeneticProgramming . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8defaultGPFunctionAndConstantSets . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10do.call.ignore.unused.arguments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11embedDataFrame . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11evolutionRestartConditions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12evolutionRestartStrategies . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13evolutionStopConditions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13exprChildrenOrEmptyList . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14expressionComplexityMeasures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15expressionComposing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16expressionCounts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17expressionCrossover . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18expressionMutation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19expressionNames . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21expressionSimilarityMeasures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22expressionTransformation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23exprLabel . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24exprToPlotmathExpr . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25extractAttributes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25formatSeconds . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26funcToPlotmathExpr . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26geneticProgramming . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27gpIndividualVisualization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29individualAnalysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30inversePermutation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31is.sType . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31iterate . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32joinElites . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32lispLists . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33listSplittingAndGrouping . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34makeFunctionFitnessFunction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35makeRegressionFitnessFunction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36mse . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36multiNicheGeneticProgramming . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37multiNicheSymbolicRegression . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39new.alist . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41new.function . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42nondeterministicRanking . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42normalize . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43

Page 3: Package ‘rgp’ · rgp-package The RGP package Description RGP is a simple yet flexible Genetic Programming system for the R environment. The system implements classical untyped

rgp-package 3

orderByParetoMeasure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43paretoSorting . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44plotFunctions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44plotPopulationFitnessComplexity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45popfitness . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46populationClustering . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47populationCreation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47predict.symbolicRegressionModel . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49print.sType . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50randelt . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50randomExpressionChilds . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51randomExpressionCreation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51randomExpressionCreationTyped . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52randomFunctionCreation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53randomFunctionCreationTyped . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54randterminalTyped . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55rangeTypeOfType . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55rgpTestingAndBenchmarking . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56rmse . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57safeGPfunctions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57searchSpaceDefinition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58selectionFunctions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59smse . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61sortBy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61sortByRange . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62sortByRanking . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62sortByType . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63sortingAlgorithms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63sTypeConstructors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64sTypeInference . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65sTypeTags . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65subDataFrame . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66summary.geneticProgrammingResult . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67symbolicRegression . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68tabulateFunction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70withAttributesOf . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71

Index 72

rgp-package The RGP package

Description

RGP is a simple yet flexible Genetic Programming system for the R environment. The systemimplements classical untyped tree-based genetic programming as well as more advanced variantsincluding, for example, strongly typed genetic pro- gramming and Pareto genetic programming.

Page 4: Package ‘rgp’ · rgp-package The RGP package Description RGP is a simple yet flexible Genetic Programming system for the R environment. The system implements classical untyped

4 arity.primitive

Author(s)

Oliver Flasch <[email protected]>, Olaf Mersmann <[email protected]>,Joerg Stork

arity Determine the number of arguments of a function...

Description

Determine the number of arguments of a function

Usage

arity(f)

Arguments

f The function to determine the arity for.

Details

Tries to determine the number of arguments of function.

Value

The arity of the function f.

arity.primitive Determine the number of arguments of a primitive function...

Description

Determine the number of arguments of a primitive function

Usage

arity.primitive(f)

Arguments

f The primitive to determine the arity for.

Details

Tries to determine the number of arguments of a primitive R function by lookup in a builtin table.

Value

The arity of the primitive f.

Page 5: Package ‘rgp’ · rgp-package The RGP package Description RGP is a simple yet flexible Genetic Programming system for the R environment. The system implements classical untyped

breeding 5

breeding Breeding of GP individuals...

Description

Breeding of GP individuals

Usage

breed(individualFactory, breedingFitness, breedingTries, warnOnFailure=TRUE,stopOnFailure=FALSE)

Arguments

individualFactory

A function of no parameters that returns a single GP individual.breedingFitness

Either a function that takes a GP individual as its only parameter and returns asingle logical value or a function that takes a GP individual as its only parameterand returns a single real value.

breedingTries The number of breeding steps to perform. In case of a boolean breedingFitnessfunction, the actual number of breeding steps performed may be lower then thisnumber (see the details).

warnOnFailure Whether to issue a warning when a boolean breedingFitness predicate wasnot fulfilled after breedingTries tries.

stopOnFailure Whether to stop with an error message when a boolean breedingFitness pred-icate was not fulfilled after breedingTries tries.

Details

Breeds GP individuals by repeated application of an individual factory function. individualFactory.The breedingFitness must be a function of domain logical (a single boolean value) or numeric(a single real number). In case of a boolean breeding function, candidate individuals are cre-ated via the individualFactory function and tested by the breedingFitness predicate until thebreedingFitness predicate is TRUE or breedingTries tries were done, in which case the last in-dividual created and tested is returned. In case of a numerical breeding function, breedingTriesindividuals are created and evaluated by the breedingFitness function. The individual with theminimal breeding fitness is returned.

Value

The GP individual that was bred.

Page 6: Package ‘rgp’ · rgp-package The RGP package Description RGP is a simple yet flexible Genetic Programming system for the R environment. The system implements classical untyped

6 buildingBlocks

buildingBlocks Support for GP buidling blocks...

Description

Support for GP buidling blocks

Usage

buildingBlock(expr, hardness=1)buildingBlockq(expr, hardness=1)

Arguments

expr The expresion to transform to a building block.

hardness The strength of the protection against varition inside the building block. Mustbe a numeric in the interval [0.0, 1.0]. A hardness of 1.0 (the default) meansthat the building block will never be subject to variation.

Details

buildingBlock: Building blocks are a means for protecting expression subtrees from modificationthrough variation operators. Often, certain functional units, represented as expression subtreesin GP individuals, should stay intact during evolutionary search. Building blocks at the leafs ofexpressions can be introduced by adding them to the input variable set. Support for building blocksis planned for a future release of RGP.

buildingBlock transforms an R expression to a building block to be used as an element of theinput variable (or function) set. The parameter hardness (a numerical value in the interval [0.0 ,1.0]) determines the protection strength against variation inside the building building block. Whenhardness is set to 1.0 (the default), the building block will never be subject to variaton throughmutation or crossover. buildingBlockq is equivaltent to buildingBlock, but quotes it’s argumentexpr first.

Value

buildingBlock: A building block.

Page 7: Package ‘rgp’ · rgp-package The RGP package Description RGP is a simple yet flexible Genetic Programming system for the R environment. The system implements classical untyped

buildingBlockTags 7

buildingBlockTags Building block tags...

Description

Building block tags

Usage

buildingBlockTag(x)‘buildingBlockTag<-‘(x, value)hasBuildingBlockTag(x)

Arguments

x An expression node.

value The value of the building block tag. Must be a numerical in the interval [0.01.0].

Details

buildingBlockTag: To implement buidling blocks, i.e. subexpression protected from variation,expression nodes may be tagged with buildingBlockTags. TODO

customDist A dist function that supports custom metrics...

Description

A dist function that supports custom metrics

Usage

customDist(x, metric, diag=FALSE, upper=FALSE)

Arguments

x A vector or list of objects.

metric A metric, i.e. a function of two arguments that returns a numeric. Note that ametric must be definite and symmetric, otherwise the results will be undefined.

diag TRUE iff the diagonal of the distance matrix should be printed by print.dist.

upper TRUE iff the upper triangle of the distance matrix should be printed by print.dist.

Page 8: Package ‘rgp’ · rgp-package The RGP package Description RGP is a simple yet flexible Genetic Programming system for the R environment. The system implements classical untyped

8 dataDrivenGeneticProgramming

Details

This function computes and returns the distance matrix computed by using the given metric tocompute the distances between the rows of a data list or vector. Note that in contrast to dist, xhas to be a vector and the the distance metric is an arbitrary function that must be symmetric anddefinite.

Value

A distance matrix.

See Also

dist

dataDrivenGeneticProgramming

Data-driven untyped standard genetic programming...

Description

Data-driven untyped standard genetic programming

Usage

dataDrivenGeneticProgramming(formula, data, fitnessFunctionFactory,fitnessFunctionFactoryParameters=list(),stopCondition=makeTimeStopCondition(5), population,populationSize=100, eliteSize=ceiling(0.1 * populationSize),elite=list(), extinctionPrevention=FALSE, archive=FALSE,genealogy=FALSE, functionSet=mathFunctionSet,constantSet=numericConstantSet,selectionFunction=makeTournamentSelection(),crossoverFunction=crossover, mutationFunction,restartCondition=makeEmptyRestartCondition(),restartStrategy=makeLocalRestartStrategy(),breedingFitness=function(individual) TRUE, breedingTries=50,progressMonitor, verbose=TRUE)

Arguments

formula A formula describing the task. Only simple formulas of the form response ~ variable1 + ... + variableNare supported at this point in time.

data A data.frame containing training data for the GP run. The variables in formulamust match column names in this data frame.

Page 9: Package ‘rgp’ · rgp-package The RGP package Description RGP is a simple yet flexible Genetic Programming system for the R environment. The system implements classical untyped

dataDrivenGeneticProgramming 9

fitnessFunctionFactory

A function that accepts two parameters, a codeformula, data (given as a modelframe) and the additional parameters given in fitnessFunctionFactoryParametersand returns a fitness function.

fitnessFunctionFactoryParameters

Additional parameters to pass to the fitnessFunctionFactory.stopCondition The stop condition for the evolution main loop. See makeStepsStopCondition

for details.population The GP population to start the run with. If this parameter is missing, a new GP

population of size populationSize is created through random growth.populationSize The number of individuals if a population is to be created.eliteSize The number of elite individuals to keep. Defaults to ceiling(0.1 * populationSize).elite The elite list, must be alist of individuals sorted in ascending order by their first

fitness component.extinctionPrevention

When set to TRUE, the initialization and selection steps will try to prevent du-plicate individuals from occurring in the population. Defaults to FALSE, as thisoperation might be expensive with larger population sizes.

archive If set to TRUE, all GP individuals evaluated are stored in an archive list archiveListthat is returned as part of the result of this function.

genealogy If set to TRUE, the parent(s) of each indiviudal is stored in an archive genealogyListas part of the result of this function, enabling the reconstruction of the completegenealogy of the result population. This parameter implies archive = TRUE.

functionSet The function set.constantSet The set of constant factory functions.selectionFunction

The selection function to use. Defaults to tournament selection. See makeTour-namentSelection for details.

crossoverFunction

The crossover function.mutationFunction

The mutation function.restartCondition

The restart condition for the evolution main loop. See makeEmptyRestartCon-dition for details.

restartStrategy

The strategy for doing restarts. See makeLocalRestartStrategy for details.breedingFitness

A "breeding" function. This function is applied after every stochastic opera-tion Op that creates or modifies an individal (typically, Op is a initialization,mutation, or crossover operation). If the breeding function returns TRUE on thegiven individual, Op is considered a success. If the breeding function returnsFALSE, Op is retried a maximum of breedingTries times. If this maximumnumber of retries is exceeded, the result of the last try is considered as the resultof Op. In the case the breeding function returns a numeric value, the breedingis repeated breedingTries times and the individual with the lowest breedingfitness is considered the result of Op.

Page 10: Package ‘rgp’ · rgp-package The RGP package Description RGP is a simple yet flexible Genetic Programming system for the R environment. The system implements classical untyped

10 defaultGPFunctionAndConstantSets

breedingTries In case of a boolean breedingFitness function, the maximum number of re-tries. In case of a numerical breedingFitness function, the number of breed-ing steps. Also see the documentation for the breedingFitness parameter.Defaults to 50.

progressMonitor

A function of signature function(population, fitnessfunction, stepNumber, evaluationNumber,bestFitness, timeElapsed)to be called with each evolution step.

verbose Whether to print progress messages.

Details

Perform an untyped genetic programming using a fitness function that depends on a R data frame.Typical applications are data mining tasks such as symbolic regression or classification. The task isspecified as a formula and a fitness function factory. Only simple formulas without interactions aresupported. The result of the data-driven GP run is a model structure containing the formulas and anuntyped GP population. This function is primarily an intermediate for extensions. End-users willprobably use more specialized GP tools such as symbolicRegression.

Value

A model structure that contains the formula and an untyped GP population.

See Also

geneticProgramming

defaultGPFunctionAndConstantSets

Default function- and constant factory sets for Genetic Programming...

Description

Default function- and constant factory sets for Genetic Programming

Details

arithmeticFunctionSet: arithmeticFunctionSet is an untyped function set containing thefunctions "+", "-", "*", and "/". expLogFunctionSet is an untyped function set containing thefunctions "sqrt", "exp", and "ln". trigonometricFunctionSet is an untyped function set contain-ing the functions "sin", "cos", and "tan". mathFunctionSet is an untyped function set containingall of the above functions.

numericConstantSet is an untyped constant factory set containing a single constant factory thatcreates numeric constants via calls to runif(1, -1, 1).

Page 11: Package ‘rgp’ · rgp-package The RGP package Description RGP is a simple yet flexible Genetic Programming system for the R environment. The system implements classical untyped

do.call.ignore.unused.arguments 11

do.call.ignore.unused.arguments

A variant of do...

Description

A variant of do.call that ignores unused arguments

Usage

do.call.ignore.unused.arguments(what, args, quote=FALSE, envir=parent.frame())

Arguments

what What to call (either a function or a character vector naming a function in envir.args The args for the call, these may include arguments not used by what.quote Whether to quote the arguments.envir The environment within which to evaluate the call.

Value

The result of the call.

embedDataFrame Embed columns in a data frame...

Description

Embed columns in a data frame

Usage

embedDataFrame(x, cols, dimension=1)

Arguments

x The data frame containing the columns to embed.cols A vector a list of the names of the columns to embed.dimension The additional dimensions to generate when embedding.

Details

Embeds the columns named cols in the data frame x into a space of dimension dimension.

Value

The data frame, augmented with embedded columns, shortended by dimension rows.

Page 12: Package ‘rgp’ · rgp-package The RGP package Description RGP is a simple yet flexible Genetic Programming system for the R environment. The system implements classical untyped

12 evolutionRestartConditions

evolutionRestartConditions

Evolution restart conditions...

Description

Evolution restart conditions

Usage

makeFitnessStagnationRestartCondition(fitnessHistorySize=100, testFrequency=10,fitnessStandardDeviationLimit=1e-06)

makeFitnessDistributionRestartCondition(testFrequency=100, fitnessStandardDeviationLimit=1e-06)

Arguments

fitnessHistorySize

The number of past best fitness values to look at when calculating the best fitnessstandard deviation for makeFitnessStagnationRestartCondition.

testFrequency The frequency to test for the restart condition, in evolution steps. This parameteris mainly used with restart condititions that are expensive to calculate.

fitnessStandardDeviationLimit

The best fitness standard deviation limit for makeFitnessStagnationRestartCondition.

Details

makeEmptyRestartCondition: Evolution restart conditions are predicates (functions that return asingle logical value) of the signature function(population, fitnessFunction, stepNumber, evaluationNumber,bestFitness, timeElapsed).They are used to decide when to restart a GP evolution run that might be stuck in a local optimum.Evolution restart conditions are objects of the same type and class as evolution stop conditions.They may be freely substituted for each other.

makeEmptyRestartCondition creates a restart condition that is never fulfilled, i.e. restarts willnever occur. makeFitnessStagnationRestartCondition creates a restart strategy that holdsif the standard deviation of a last fitnessHistorySize best fitness values falls below a givenfitnessStandardDeviationLimit. makeFitnessDistributionRestartCondition creates a restartstrategy that holds if the standard deviation of the fitness values of the individuals in the current pop-ulation falls below a given fitnessStandardDeviationLimit.

Page 13: Package ‘rgp’ · rgp-package The RGP package Description RGP is a simple yet flexible Genetic Programming system for the R environment. The system implements classical untyped

evolutionRestartStrategies 13

evolutionRestartStrategies

Evolution restart strategies...

Description

Evolution restart strategies

Usage

makeLocalRestartStrategy(populationType, extinctionPrevention=FALSE,breedingFitness=function(individual) TRUE, breedingTries=50)

Arguments

populationType The sType of the replacement individuals, defaults to NULL for creating untypedpopulations.

extinctionPrevention

Whether to surpress duplicate individuals in newly initialized populations. SeegeneticProgramming for details.

breedingFitness

A breeding function. See the documentation for geneticProgramming for de-tails.

breedingTries The number of breeding steps.

Details

Evolution restart strategies are functions of the signature function(fitnessFunction,population, populationSize, functionSet, inputVariables, constantSet)that return a list of two obtjects: First, a population that replace the run’s current population. Sec-ond, a list of elite individuals to keep.

makeLocalRestartStrategy creates a restart strategy that replaces all individuals with new in-dividuals. The single best individual is returned as the elite. When using a multi-criterial fitnessfunction, only the first component counts in the fitness sorting.

evolutionStopConditions

Evolution stop conditions...

Description

Evolution stop conditions

Page 14: Package ‘rgp’ · rgp-package The RGP package Description RGP is a simple yet flexible Genetic Programming system for the R environment. The system implements classical untyped

14 exprChildrenOrEmptyList

Usage

makeStepsStopCondition(stepLimit)makeEvaluationsStopCondition(evaluationLimit)makeFitnessStopCondition(fitnessLimit)makeTimeStopCondition(timeLimit)‘&.stopCondition‘(e1, e2)‘|.stopCondition‘(e1, e2)‘!.stopCondition‘(e1)

Arguments

stepLimit The maximum number of evolution steps for makeStepsStopCondition.evaluationLimit

The maximum number of fitness function evaluations for makeEvaluationsStopCondition.

fitnessLimit The minimum fitness for makeFitnessStopCondition.

timeLimit The maximum runtime in seconds for makeTimeStopCondition.

e1 A stop condition.

e2 A stop condition.

Details

makeStepsStopCondition: Evolution stop conditions are predicates (functions that return a singlelogical value) of the signature function(population, stepNumber, evaluationNumber, bestFitness,timeElapsed).They are used to decide when to finish a GP evolution run. Stop conditions must be members of theS3 class c("stopCondition", "function"). They can be combined using the generic and (&), or(|) and not (!) functions.

makeStepsStopCondition creates a stop condition that is fulfilled if the number of evolution stepsexceeds a given limit. makeEvaluationsStopCondition creates a stop condition that is fulfilledif the number of fitness function evaluations exceeds a given limit. makeFitnessStopConditioncreates a stop condition that is fulfilled if the number best fitness seen in an evaluation run undercutsa certain limit. makeTimeStopCondition creates a stop condition that is fulfilled if the run time (inseconds) of an evolution run exceeds a given limit.

exprChildrenOrEmptyList

Return the Children of an Expression or the Empty List if there areNone...

Description

Return the Children of an Expression or the Empty List if there are None

Usage

exprChildrenOrEmptyList(expr)

Page 15: Package ‘rgp’ · rgp-package The RGP package Description RGP is a simple yet flexible Genetic Programming system for the R environment. The system implements classical untyped

expressionComplexityMeasures 15

Arguments

expr The expression to return the children for.

Details

Internal tool function that returns the children expressions of an R expression or the empty list ifthere are no children, i.e. if the expression is atomic or NULL. If the expression is a "function"expression, i.e. an expression that would evaluate to a function, exprChildrenOrEmptyList willreturn the function body expression as the only child.

Value

The expression’s children as a list, or the empty list if there are none.

expressionComplexityMeasures

Complexity measures for R functions and expressions...

Description

Complexity measures for R functions and expressions

Usage

exprDepth(expr)funcDepth(func)exprSize(expr)exprLeaves(expr)exprCount(expr, predicate=function(node) TRUE)funcSize(func)funcLeaves(func)funcCount(func, predicate=function(node) TRUE)exprVisitationLength(expr)exprVisitationLengthRecursive(expr)funcVisitationLength(func)

Arguments

expr An R expression.

func An R function.

predicate An R predicate (function with range type logical).

Page 16: Package ‘rgp’ · rgp-package The RGP package Description RGP is a simple yet flexible Genetic Programming system for the R environment. The system implements classical untyped

16 expressionComposing

Details

exprDepth: exprDepth returns the depth of the tree representation ("exression tree") of an R ex-pression. funcDepth returns the tree depth of the body expression of an R function. exprSizereturns the number of nodes in the tree of an R expression. exprLeaves returns the number ofleave nodes in the tree of an R expression. exprCount returns the number of tree nodes in an Rexpression matching a given predicate. funcSize returns the number of nodes in the body expres-sion tree of an R function. funcLeaves returns the number of leave nodes in the body expressiontree of an R function. funcCount returns the number of nodes in an R function body expressionmatching a given predicate. exprVisitationLength returns the visitation length of the tree of anR expression. The visitation length is the total number of nodes in all possible subtrees of a tree.funcVisitationLength returns the visitation length of the body expression tree of an R function.The visitation length can be interpreted as the size of the expression obtained by substituting allinner functions by their function bodies (see "Crossover Bias in Genetic Programming", MaartenKeijzer and James Foster).

expressionComposing Functions for decomposing and recombining R expressions...

Description

Functions for decomposing and recombining R expressions

Usage

subexpressions(expr)

Arguments

expr An R expression.

Details

subExpressions returns a list of all subexpressions (subtrees) of an expression expr.

Value

The decomposed or recombined expression.

Page 17: Package ‘rgp’ · rgp-package The RGP package Description RGP is a simple yet flexible Genetic Programming system for the R environment. The system implements classical untyped

expressionCounts 17

expressionCounts Upper bounds for expression tree search space sizes...

Description

Upper bounds for expression tree search space sizes

Usage

exprShapesOfDepth(funcset, n)exprShapesOfMaxDepth(funcset, n)exprsOfDepth(funcset, inset, n)exprsOfMaxDepth(funcset, inset, n)exprShapesOfSize(funcset, n)exprShapesOfMaxSize(funcset, n)exprsOfSize(funcset, inset, n)exprsOfMaxSize(funcset, inset, n)

Arguments

funcset The function set.

inset The set of input variables.

n The fixed size or depth.

Details

exprShapesOfDepth: These functions return the number of structurally different expressions or ex-pression shapes of a given depth or size that can be build from a fixed function- and input-variableset. Here, "expression shape" means the shape of an expression tree, not taking any node labelsinto account. exprShapesOfDepth returns the number of structurally different expression shapesof a depth exactly equal to n. exprShapesOfMaxDepth returns the number of structurally differentexpression shapes of a depth less or equal to n. exprsOfDepth returns the number of structurallydifferent expressions of a depth exactly equal to n. Note that constants are handled by conceptuallysubstiting them with a fresh input variable. exprShapesOfMaxDepth returns the number of struc-turally different expressions of a depth less or equal to n. Note that constants are handled by con-ceptually substiting them with a fresh input variable. exprShapesOfSize, exprShapesOfMaxSize,exprsOfSize, exprsOfMaxSize are equivalents that regard expression tree size (number of nodes)instead of expression tree depth.

Page 18: Package ‘rgp’ · rgp-package The RGP package Description RGP is a simple yet flexible Genetic Programming system for the R environment. The system implements classical untyped

18 expressionCrossover

expressionCrossover Random crossover (recombination) of functions and expressions...

Description

Random crossover (recombination) of functions and expressions

Usage

crossover(func1, func2, crossoverprob=0.1, breedingFitness=function(individual)TRUE, breedingTries=50)

crossoverexpr(expr1, expr2, crossoverprob)crossoverTyped(func1, func2, crossoverprob=0.1, breedingFitness=function(individual)

TRUE, breedingTries=50)crossoverexprTyped(expr1, expr2, crossoverprob)

Arguments

expr1 The first parent R expression.

func1 The first parent R function.

expr2 The second parent R expression.

func2 The second parent R function.

crossoverprob The probability of crossover at each node of the first parent function (expres-sion).

breedingFitness

A breeding function. See the documentation for geneticProgramming for de-tails.

breedingTries The number of breeding steps.

Details

crossover: Replace a random subtree of func1 (expr1) with a random subtree of func2 (expr2)and return the resulting function (expression), i.e. the modified func1 (expr1). crossoverexprhandles crossover of expressions instead of functions. crossoverTyped and crossoverexprTypedonly exchage replace subtress if the sTypes of their root nodes match.

All RGP recombination operators operating on functions have the S3 class c("recombinationOperator", "function").

Value

crossover: The child function (expression).

Page 19: Package ‘rgp’ · rgp-package The RGP package Description RGP is a simple yet flexible Genetic Programming system for the R environment. The system implements classical untyped

expressionMutation 19

expressionMutation Random mutation of functions and expressions...

Description

Random mutation of functions and expressions

Usage

mutateFunc(func, funcset, mutatefuncprob=0.1, breedingFitness=function(individual)TRUE, breedingTries=50)

mutateSubtree(func, funcset, inset, conset, mutatesubtreeprob=0.1, maxsubtreedepth=5,breedingFitness=function(individual) TRUE, breedingTries=50)

mutateNumericConst(func, mutateconstprob=0.1, breedingFitness=function(individual) TRUE,breedingTries=50)

mutateFuncTyped(func, funcset, mutatefuncprob=0.1, breedingFitness=function(individual)TRUE, breedingTries=50)

mutateSubtreeTyped(func, funcset, inset, conset, mutatesubtreeprob=0.1, maxsubtreedepth=5,breedingFitness=function(individual) TRUE, breedingTries=50)

mutateNumericConstTyped(func, mutateconstprob=0.1, breedingFitness=function(individual) TRUE,breedingTries=50)

mutateChangeLabel(func, funcset, inset, conset, strength=1,breedingFitness=function(individual) TRUE, breedingTries=50)

mutateInsertSubtree(func, funcset, inset, conset, strength=1, subtreeDepth=2,breedingFitness=function(individual) TRUE, breedingTries=50)

mutateDeleteSubtree(func, funcset, inset, conset, strength=1, subtreeDepth=2,constprob=0.2, breedingFitness=function(individual) TRUE,breedingTries=50)

mutateChangeDeleteInsert(func, funcset, inset, conset, strength=1, subtreeDepth=2,constprob=0.2, iterations=1, changeProbability=1/3,deleteProbability=1/3, insertProbability=1/3,breedingFitness=function(individual) TRUE, breedingTries=50)

mutateDeleteInsert(func, funcset, inset, conset, strength=1, subtreeDepth=2,constprob=0.2, iterations=1, deleteProbability=0.5,insertProbability=0.5, breedingFitness=function(individual) TRUE,breedingTries=50)

Arguments

func The function to mutate randomly.

funcset The function set.

inset The set of input variables.

conset The set of constant factories.

mutatefuncprob The probability of trying to replace an inner function at each node.mutatesubtreeprob

The probability of replacing a subtree with a newly grown subtree at each node.

Page 20: Package ‘rgp’ · rgp-package The RGP package Description RGP is a simple yet flexible Genetic Programming system for the R environment. The system implements classical untyped

20 expressionMutation

maxsubtreedepth

The maximum depth of newly grown subtrees.mutateconstprob

The probability of mutating a constant by adding rnorm(1) to it.

strength The number of individual point mutations (changes, insertions, deletions) toperform.

subtreeDepth The depth of the subtrees to insert or delete.

constprob The probability of creating a constant versus an input variable when inserting anew subtree.

iterations The number of times to apply a mutation operator to a GP individual. This canbe used as a generic way of controling the strength of the genotypic effect ofmutation.

changeProbability

The probability for selecting the mutateChangeLabel operator.deleteProbability

The probability for selecting the mutateDeleteSubtree operator.insertProbability

The probability for selecting the mutateInsertSubtree operator.breedingFitness

A breeding function. See the documentation for geneticProgramming for de-tails.

breedingTries The number of breeding steps.

Details

mutateFunc: RGP implements two sets of mutation operators. The first set is inspired by classicalGP systems. Mutation strength is controlled by giving mutation probabilities: mutateFunc mutatesa function f by recursively replacing inner function labels in f with probability mutatefuncprob.mutateSubtree mutates a function by recursively replacing inner nodes with newly grown subtreesof maximum depth maxsubtreedepth. mutateNumericConst mutates a function by perturbingeach numeric (double) constant c with probability mutateconstprob by setting c := c+rnorm(1).Note that constants of other typed than double (e.g integers) are not affected. mutateFuncTyped,mutateSubtreeTyped, and mutateNumericConstTyped are variants of the above functions thatonly create well-typed result expressions.

The second set of mutation operators features a more orthogonal design, with each individual op-erator having a only a small effect on the genotype. Mutation strength is controlled by the integralstrength parameter. mutateChangeLabel Selects a node (inner node or leaf) by uniform randomsampling and replaces the label of this node by a new label of matching type. mutateInsertSubtreeSelects a leaf by uniform random sampling and replaces it with a matching subtree of the exact depthof subtreeDepth. mutateDeleteSubtree Selects a subree of the exact depth of subtreeDepth byuniform random sampling and replaces it with a matching leaf. mutateChangeDeleteInsert Ei-ther applies mutateChangeLabel, mutateDeleteSubtree, or mutateDeleteSubtree. The prob-ability weights for selecting an operator can be supplied via the ...Probability arguments (probabilityweights are normalized to a sum of 1). mutateDeleteInsert Either applies mutateDeleteSubtreeor mutateDeleteSubtree. The probability weights for selecting an operator can be supplied viathe ...Probability arguments (probability weights are normalized to a sum of 1). The above functionsautomatically create well-typed result expressions when used in a strongly typed GP run.

Page 21: Package ‘rgp’ · rgp-package The RGP package Description RGP is a simple yet flexible Genetic Programming system for the R environment. The system implements classical untyped

expressionNames 21

All RGP mutation operators have the S3 class c("mutationOperator", "function").

Value

mutateFunc: The randomly mutated function.

expressionNames Functions for handling R symbols / names...

Description

Functions for handling R symbols / names

Usage

toName(x, copyAttributes=TRUE)extractLeafSymbols(expr)

Arguments

x The object to operate on.

expr An R expression.

copyAttributes Whether to copy all attributes of x to the result object.

Details

toName: toName converts a character string x to an R symbol / name, while copying all attributes iffcopyAttributes is TRUE. In the case that x is not a character string, a copy of the object is returnedas-is. extractLeafSymbols returns the set of symbols (names) at the leafs of an expression expr.The symbols are returned as character strings.

Value

toName: The result.

Page 22: Package ‘rgp’ · rgp-package The RGP package Description RGP is a simple yet flexible Genetic Programming system for the R environment. The system implements classical untyped

22 expressionSimilarityMeasures

expressionSimilarityMeasures

Similarity and Distance Measures for R Functions and Expressions...

Description

Similarity and Distance Measures for R Functions and Expressions

Usage

commonSubexpressions(expr1, expr2)numberOfCommonSubexpressions(expr1, expr2)normalizedNumberOfCommonSubexpressions(expr1, expr2)NCSdist(expr1, expr2)sizeWeightedNumberOfCommonSubexpressions(expr1, expr2)normalizedSizeWeightedNumberOfCommonSubexpressions(expr1, expr2)SNCSdist(expr1, expr2)differingSubexpressions(expr1, expr2)numberOfDifferingSubexpressions(expr1, expr2)sizeWeightedNumberOfDifferingSubexpressions(expr1, expr2)trivialMetric(a, b)normInducedTreeDistance(norm, labelDistance=trivialMetric, distanceFoldOperator)normInducedFunctionDistance(norm, labelDistance=trivialMetric, distanceFoldOperator)

Arguments

expr1 An R expression.

expr2 An R expression.

a An R object.

b An R object.

norm A norm to derive a tree distance metric from.

labelDistance A metric for measuring distances of tree node labels, i.e. function names orconstants.

distanceFoldOperator

The operator used by normInducedTreeDistance to combine the measuressubtree distances, defaults to ‘+‘.

Details

commonSubexpressions: These functions implement several similarity and distance measures forR functions (i.e. their body expressions). TODO check and document measure-theoretic propertiesof each measure defined here TODO these distance measures are metrics, some of them are norm-induced metrics commonSubexpressions returns the set of common subexpressions of expr1 andexpr2. This is not a metric by itself, but can be used to implement several subtree-based similarity

Page 23: Package ‘rgp’ · rgp-package The RGP package Description RGP is a simple yet flexible Genetic Programming system for the R environment. The system implements classical untyped

expressionTransformation 23

metrics. of expr1 and expr2. sizeWeightedNumberOfcommonSubexpressions returns the num-ber of common subexpressions of expr1 and expr2, weighting the size of each common subex-pression. Note that for every expression e, sizeWeightedNumberOfcommonSubexpressions( e, e ) == exprVisitationLength( e ). normalizedNumberOfCommonSubexpressions returnsthe ratio of the number of common subexpressions of expr1 and expr2 in relation to the number ofsubexpression in the larger expression of expr1 and expr2. normalizedSizeWeightedNumberOfcommonSubexpressionsreturns the ratio of the size-weighted number of common subexpressions of expr1 and expr2in relation to the visitation length of the larger expression of expr1 and expr2. NCSdist andSNCSdist are distance metrics derived from normalizedNumberOfCommonSubexpressions andnormalizedSizeWeightedNumberOfCommonSubexpressions respectively. differingSubexpressions,and codenumberOfDifferingSubexpressions are duals of the functions described above, based oncounting the number of differing subexpressions of expr1 and expr2. The possible functions "nor-malizedNumberOfDifferingSubexpressions" and "normalizedSizeWeightedNumberOfDifferingSubex-pressions" where ommited because they are always equal to NCSdist and SNCSdist by definition.trivialMetric The "trivial" metric M(a, b) that is 0 iff a == b, 1 otherwise. normInducedTreeDistanceUses a norm on expression trees and a metric on tree node labels to induce a metric M on ex-pression trees A and B: If both A and B are empty (represented as NULL), M(A, B) := 0. Ifexactly one of A or B is empty, M(A, B) := "the norm applied to the non-empty tree". If nei-ther A or B is empty, the difference of their root node labels (as measured by labelDistance)is added to the sum of the differences of the children. The children lists are padded with emptytrees to equalize their sizes. The summation operator can be changed via distanceFoldOperator.normInducedFunctionDistance Is wrapper that applies normInducedTreeDistance to the bodiesof the given functions.

expressionTransformation

Common higher-order functions for transforming R expressions...

Description

Common higher-order functions for transforming R expressions

Usage

MapExpressionNodes(f, expr, functions=TRUE, inners=FALSE, leafs=TRUE)MapExpressionLeafs(f, expr)MapExpressionSubtrees(f, expr)FlattenExpression(expr)AllExpressionNodes(p, expr)AnyExpressionNode(p, expr)

Arguments

f The function to apply.

functions Whether to apply f to the function symbols of expr. Defaults to TRUE.

inners Whether to apply f to the inner subtrees of expr. Defaults to FALSE.

Page 24: Package ‘rgp’ · rgp-package The RGP package Description RGP is a simple yet flexible Genetic Programming system for the R environment. The system implements classical untyped

24 exprLabel

leafs Wheter to apply f to the leafs of expr. Defaults to TRUE.

p The predicate to check.

expr The expression to transform.

Details

MapExpressionNodes: MapExpressionNodes transforms an expression expr by replacing everynode in the tree with the result of applying a function f. The parameters functions, inners,and leafs control if f should be applied to the function symbols, inner subtrees, and leafs ofexpr, respectively. MapExpressionLeafs and MapExpressionSubtrees are shorthands for callsto MapExpressionNodes. expr. AllExpressionNodes checks if all nodes in the tree of exprsatisfy the predicate p (p returns TRUE for every node). This function short-cuts returning FALSE assoon as a node that does not satisfy p is encountered. AnyExpressionNode checks if any node inthe tree of expr satisfies the predicate p. This function short-cuts returning TRUE as soon as a nodethat satisfies p is encountered.

Value

MapExpressionNodes: The transformed expression.

exprLabel Return the "label" at the Root Node of an Expression Tree...

Description

Return the "label" at the Root Node of an Expression Tree

Usage

exprLabel(expr)

Arguments

expr The expression to return the root label for.

Details

Internal tool function that returns the function name if expr is a call, or otherwise just expr itself.

Value

The expression’s root label.

Page 25: Package ‘rgp’ · rgp-package The RGP package Description RGP is a simple yet flexible Genetic Programming system for the R environment. The system implements classical untyped

exprToPlotmathExpr 25

exprToPlotmathExpr Convert any expression to an expression that is plottable by plotmath...

Description

Convert any expression to an expression that is plottable by plotmath

Usage

exprToPlotmathExpr(expr)

Arguments

expr The GP-generated expression to convert.

Details

Tries to convert a GP-generated expression expr to an expression plottable by plotmath by replac-ing GP variants of arithmetic operators by their standard counterparts.

Value

An expression plottable by plotmath.

extractAttributes Extract a given attribute of all objects in a list and tag that list withthe...

Description

Extract a given attribute of all objects in a list and tag that list with the list of extracted attributes

Usage

extractAttributes(x, extractAttribute, tagAttribute=extractAttribute, default)

Arguments

x A list with objects containing the attribute attribute.extractAttribute

The attribute to extract from all objects in the list x.tagAttribute The name of the attribute for x holding the list of extracted attributes.default A default value to return if an object in x has no attribute attribute.

Value

The list x, tagged with a new attribute tagAttribute.

Page 26: Package ‘rgp’ · rgp-package The RGP package Description RGP is a simple yet flexible Genetic Programming system for the R environment. The system implements classical untyped

26 funcToPlotmathExpr

formatSeconds Format time and data values into human-readable character vectors...

Description

Format time and data values into human-readable character vectors

Usage

formatSeconds(seconds, secondDecimals=2)

Arguments

seconds A numeric vector denoting seconds.

secondDecimals The number of decimal places to show for seconds. Defaults to 2.

Details

These functions convert date and time values into human-readable character vectors. formatSecondsformats time values given as a numerical vector denoting seconds into human-readable charactervectors, i.e. formatSeconds(70) results in the string "1 minute, 10 seconds".

Value

A character vector containg a human-readable representation of the given date/time.

funcToPlotmathExpr Convert a function to an expression plottable by plotmath...

Description

Convert a function to an expression plottable by plotmath

Usage

funcToPlotmathExpr(func)

Arguments

func The function to convert.

Details

Tries to convert a function func to an expression plottable by plotmath by replacing arithmeticoperators and "standard" functions by plottable counterparts.

Page 27: Package ‘rgp’ · rgp-package The RGP package Description RGP is a simple yet flexible Genetic Programming system for the R environment. The system implements classical untyped

geneticProgramming 27

Value

An expression plottable by plotmath.

See Also

funcToIgraph

geneticProgramming Standard typed and untyped genetic programming...

Description

Standard typed and untyped genetic programming

Usage

geneticProgramming(fitnessFunction, stopCondition=makeTimeStopCondition(5), population,populationSize=100, eliteSize=ceiling(0.1 * populationSize),elite=list(), functionSet=mathFunctionSet,inputVariables=inputVariableSet("x"),constantSet=numericConstantSet,selectionFunction=makeTournamentSelection(),crossoverFunction=crossover, mutationFunction,restartCondition=makeEmptyRestartCondition(),restartStrategy=makeLocalRestartStrategy(),breedingFitness=function(individual) TRUE, breedingTries=50,extinctionPrevention=FALSE, archive=FALSE, genealogy=FALSE,progressMonitor, verbose=TRUE)

typedGeneticProgramming(fitnessFunction, type, stopCondition=makeTimeStopCondition(5),population, populationSize=100, eliteSize=ceiling(0.1 *populationSize), elite=list(), functionSet, inputVariables,constantSet, selectionFunction=makeTournamentSelection(),crossoverFunction=crossoverTyped, mutationFunction,restartCondition=makeEmptyRestartCondition(),restartStrategy=makeLocalRestartStrategy(populationType = type),breedingFitness=function(individual) TRUE, breedingTries=50,extinctionPrevention=FALSE, archive=FALSE, genealogy=FALSE,progressMonitor, verbose=TRUE)

Arguments

fitnessFunction

In case of a single-objective selection function, fitnessFunction must be asingle function that assigns a numerical fitness value to a GP individual rep-resented as a R function. Smaller fitness values mean higher/better fitness. Ifa multi-objective selection function is used, fitnessFunction must return anumerical vector of fitness values.

Page 28: Package ‘rgp’ · rgp-package The RGP package Description RGP is a simple yet flexible Genetic Programming system for the R environment. The system implements classical untyped

28 geneticProgramming

type The range type of the individual functions. This parameter only applies totypedGeneticProgramming.

stopCondition The stop condition for the evolution main loop. See makeStepsStopConditionFor details.

population The GP population to start the run with. If this parameter is missing, a new GPpopulation of size populationSize is created through random growth.

populationSize The number of individuals if a population is to be created.

eliteSize The number of elite individuals to keep. Defaults to ceiling(0.1 * populationSize).

elite The elite list, must be alist of individuals sorted in ascending order by their firstfitness component.

functionSet The function set.

inputVariables The input variable set.

constantSet The set of constant factory functions.selectionFunction

The selection function to use. Defaults to tournament selection. See makeTour-namentSelection for details.

crossoverFunction

The crossover function.mutationFunction

The mutation function.restartCondition

The restart condition for the evolution main loop. See makeEmptyRestartCon-dition for details.

restartStrategy

The strategy for doing restarts. See makeLocalRestartStrategy for details.breedingFitness

A "breeding" function. This function is applied after every stochastic opera-tion Op that creates or modifies an individal (typically, Op is a initialization,mutation, or crossover operation). If the breeding function returns TRUE on thegiven individual, Op is considered a success. If the breeding function returnsFALSE, Op is retried a maximum of breedingTries times. If this maximumnumber of retries is exceeded, the result of the last try is considered as the resultof Op. In the case the breeding function returns a numeric value, the breedingis repeated breedingTries times and the individual with the lowest breedingfitness is considered the result of Op.

breedingTries In case of a boolean breedingFitness function, the maximum number of re-tries. In case of a numerical breedingFitness function, the number of breed-ing steps. Also see the documentation for the breedingFitness parameter.Defaults to 50.

extinctionPrevention

When set to TRUE, the initialization and selection steps will try to prevent du-plicate individuals from occurring in the population. Defaults to FALSE, as thisoperation might be expensive with larger population sizes.

archive If set to TRUE, all GP individuals evaluated are stored in an archive list archiveListthat is returned as part of the result of this function.

Page 29: Package ‘rgp’ · rgp-package The RGP package Description RGP is a simple yet flexible Genetic Programming system for the R environment. The system implements classical untyped

gpIndividualVisualization 29

genealogy If set to TRUE, the parent(s) of each indiviudal is stored in an archive genealogyListas part of the result of this function, enabling the reconstruction of the completegenealogy of the result population. This parameter implies archive = TRUE.

progressMonitor

A function of signature function(population, fitnessfunction, stepNumber, evaluationNumber,bestFitness, timeElapsed)to be called with each evolution step.

verbose Whether to print progress messages.

Details

geneticProgramming: Perform a standard genetic programming (GP) run. Use geneticProgrammingfor untyped genetic programming or typedGeneticProgramming for typed genetic programmingruns. The required argument fitnessFunction must be supplied with an objective function thatassigns a numerical fitness value to an R function. Fitness values are minimized, i.e. smaller valuesdenote higher/better fitness. If a multi-objective selectionFunction is used, fitnessFunctionreturn a numerical vector of fitness values. The result of the GP run is a GP result object containinga GP population of R functions. summary.geneticProgrammingResult can be used to create sum-mary views of a GP result object. During the run, restarts are triggered by the restartCondition.When a restart is triggered, the restartStrategy is executed, which returns a new population to re-place the current one as well as a list of elite individuals. These are added to the runs elite list,where fitter individuals replace individuals with lesser fittness. The runs elite list is always sortedby fitness in ascending order. Only the first component of a multi-criterial fitness counts in thissorting. After a GP run, the population is inserted into the elite list. The elite list is returned as partof the GP result object.

Value

geneticProgramming: A genetic programming result object that contains a GP population in thefield population, as well as metadata describing the run parameters.

See Also

summary.geneticProgrammingResult, symbolicRegression

gpIndividualVisualization

Visualization of functions and expressions as trees...

Description

Visualization of functions and expressions as trees

Usage

funcToIgraph(func)exprToIgraph(expr)exprToGraph(expr)

Page 30: Package ‘rgp’ · rgp-package The RGP package Description RGP is a simple yet flexible Genetic Programming system for the R environment. The system implements classical untyped

30 individualAnalysis

Arguments

func An R function.

expr An R expression.

Details

funcToIgraph: The following functions plot R expressions and functions as trees. The igraphpackage is required for most of these functions. exprToGraph transforms an R expression intoa graph given as a character vector of vertices V and a even-sized numeric vector of edges E.Two elements i and i+1 in E encode a directed edge from V[i] to V[i+1]. funcToIgraph andexprToIgraph return an igraph graph object for an R function or an R expression.

Value

funcToIgraph: The result (see the details section).

See Also

funcToPlotmathExpr

individualAnalysis Functions for analysing GP individuals...

Description

Functions for analysing GP individuals

Usage

inputVariablesOfIndividual(ind, inset)

Arguments

ind A GP individual, represented as a R function.

inset A set of input variables.

Details

inputVariablesOfIndividual returns a list of input variables in inset that are used by the GPindividual ind.

Page 31: Package ‘rgp’ · rgp-package The RGP package Description RGP is a simple yet flexible Genetic Programming system for the R environment. The system implements classical untyped

inversePermutation 31

inversePermutation Calculate the inverse of a permutation...

Description

Calculate the inverse of a permutation

Usage

inversePermutation(x)

Arguments

x The permutation to return the inverse for.

Details

Returns the inverse of a permutation x given as an integer vector. This function is useful to turn aranking into an ordering and back, for example.

Value

The inverse of the permutation x.

See Also

rank, order

is.sType Check if an object is an sType...

Description

Check if an object is an sType

Usage

is.sType(x)

Arguments

x The object to check.

Details

Returns TRUE iff its argument is an sType.

Page 32: Package ‘rgp’ · rgp-package The RGP package Description RGP is a simple yet flexible Genetic Programming system for the R environment. The system implements classical untyped

32 joinElites

Value

TRUE iff x is an sType.

iterate Repeatedly apply a function...

Description

Repeatedly apply a function

Usage

iterate(n, f, arg, ...)

Arguments

n The number of times to apply f, must be >= 0. If 0, arg is returned.

f The function to apply.

arg The argument to repeatedly apply f to.

... Additional argument to pass to f at each application.

Details

Repeatedly apply a function f to an argument arg, additional arguments ... are supplied un-changed in each call. E.g. iterate(3, foo, 42.14, "bar") is equivalent to foo(foo(foo(42.14, "bar"), "bar"), "bar").

Value

The result of repeatedly applying f.

joinElites Join elite lists...

Description

Join elite lists

Usage

joinElites(individuals, elite, eliteSize, fitnessFunction)

Page 33: Package ‘rgp’ · rgp-package The RGP package Description RGP is a simple yet flexible Genetic Programming system for the R environment. The system implements classical untyped

lispLists 33

Arguments

individuals The list of individuals to insert.

elite The list of elite individuals to insert individuals into. This list must be sortedby fitness in ascending order, i.e. lower fitnesses first.

eliteSize The maximum size of the elite.fitnessFunction

The fitness function.

Details

Inserts a list of new individuals into an elite list, replacing the worst individuals in this list to makeplace, if needed.

Value

The elite with individuals inserted, sorted by fitness in ascending order, i.e. lower fitnessesfirst.

lispLists Functions for Lisp-like list processing...

Description

Functions for Lisp-like list processing

Usage

first(x)rest(x)second(x)third(x)fourth(x)fifth(x)is.empty(x)is.atom(x)is.composite(x)contains(x, elt)

Arguments

x A list or vector.

elt An element of a list or vector.

Page 34: Package ‘rgp’ · rgp-package The RGP package Description RGP is a simple yet flexible Genetic Programming system for the R environment. The system implements classical untyped

34 listSplittingAndGrouping

Details

first: Simple wrapper functions that allow Lisp-like list processing in R: first to fifth return thefirst to fifth element of the list x. rest returns all but the first element of the list x. is.empty returnsTRUE iff the list x is of length 0. is.atom returns TRUE iff the list x is of length 1. is.compositereturns TRUE iff the list x is of length > 1. contains return TRUE iff the list x contains an elementidentical to elt.

listSplittingAndGrouping

Splitting and grouping of lists...

Description

Splitting and grouping of lists

Usage

splitList(l, groupAssignment)groupListConsecutive(l, numberOfGroups)groupListDistributed(l, numberOfGroups)flatten(l, recursive=FALSE)intersperse(xs, ys, pairConstructor=list)

Arguments

l A list.

xs A list.

ys A list.pairConstructor

The function to use for constructing pairs, defaults to list.groupAssignment

A vector of group assignment indices.

numberOfGroups The number of groups to create, must be <= length(l)

recursive Whether to operate recursively on sublists or vectors.

Details

splitList: Functions for splitting and grouping lists into sublists. splitList splits a list l intomax(groupAssignment) groups. The integer indices of groupAssignment determine in whichgroup each element of l goes. groupListConsecutive splits l into numberOfGroups consecu-tive sublists (or groups). groupListDistributed distributes l into numberOfGroups sublists (orgroups). flatten flattens a list l of lists into a flat list by concatenation. If recursive is TRUE(defaults to FALSE), flatten will be recursively called on each argument first. intersperse joinstwo lists xs and ys into a list of pairs containig every possible pair, i.e. intersperse(xs, ys)equals the product list of xs and ys. The pairConstructor parameter can be used to change thetype of pairs returned.

Page 35: Package ‘rgp’ · rgp-package The RGP package Description RGP is a simple yet flexible Genetic Programming system for the R environment. The system implements classical untyped

makeFunctionFitnessFunction 35

Value

splitList: A list of lists, where each member represents a group.

makeFunctionFitnessFunction

Create a fitness function from a function of one variable...

Description

Create a fitness function from a function of one variable

Usage

makeFunctionFitnessFunction(func, from=-1, to=1, steps=128, errorMeasure=rmse, indsizelimit=NA)

Arguments

func The reference function.

from The start of the sequence of fitness cases.

to The end of the sequence of fitness cases.

steps The number of steps in the sequence of fitness cases.

errorMeasure A function to use as an error measure, defaults to RMSE.

indsizelimit Individuals exceeding this size limit will get a fitness of Inf.

Details

Creates a fitness function that calculates an error measure with respect to an arbitrary referencefunction of one variable on the sequence of fitness cases seq(from, to, length = steps). Whenan indsizelimit is given, individuals exceeding this limit will receive a fitness of Inf.

Value

A fitness function based on the reference function func.

Page 36: Package ‘rgp’ · rgp-package The RGP package Description RGP is a simple yet flexible Genetic Programming system for the R environment. The system implements classical untyped

36 mse

makeRegressionFitnessFunction

Create a fitness function for symbolic regression...

Description

Create a fitness function for symbolic regression

Usage

makeRegressionFitnessFunction(formula, data, errorMeasure=rmse, indsizelimit=NA,penalizeGenotypeConstantIndividuals=FALSE)

Arguments

formula A formula object describing the regression task.

data An optional data frame containing the variables in the model.

errorMeasure A function to use as an error measure, defaults to RMSE.

indsizelimit Individuals exceeding this size limit will get a fitness of Inf.penalizeGenotypeConstantIndividuals

Individuals that do not contain any input variables will get a fitness of Inf.

Details

Creates a fitness function that calculates an error measure with respect to a given set of data vari-ables. A simplified version of the formula syntax is used to describe the regression task. When anindsizelimit is given, individuals exceeding this limit will receive a fitness of Inf.

Value

A fitness function to be used in symbolic regression.

mse Mean squared error (MSE)...

Description

Mean squared error (MSE)

Usage

mse(x, y)

Page 37: Package ‘rgp’ · rgp-package The RGP package Description RGP is a simple yet flexible Genetic Programming system for the R environment. The system implements classical untyped

multiNicheGeneticProgramming 37

Arguments

x A numeric vector or list.

y A numeric vector or list.

Value

The MSE between x and y.

multiNicheGeneticProgramming

Cluster-based multi-niche genetic programming...

Description

Cluster-based multi-niche genetic programming

Usage

multiNicheGeneticProgramming(fitnessFunction, stopCondition=makeTimeStopCondition(25),passStopCondition=makeTimeStopCondition(5), numberOfNiches=2,clusterFunction=groupListConsecutive, joinFunction=function(niches)Reduce(c, niches), population, populationSize=100,eliteSize=ceiling(0.1 * populationSize), elite=list(),functionSet=mathFunctionSet, inputVariables=inputVariableSet("x"),constantSet=numericConstantSet,selectionFunction=makeTournamentSelection(),crossoverFunction=crossover, mutationFunction,restartCondition=makeEmptyRestartCondition(),restartStrategy=makeLocalRestartStrategy(), progressMonitor,verbose=TRUE, clusterApply=sfClusterApplyLB, clusterExport=sfExport)

Arguments

fitnessFunction

In case of a single-objective selection function, fitnessFunction must be asingle function that assigns a numerical fitness value to a GP individual rep-resented as a R function. Smaller fitness values mean higher/better fitness. Ifa multi-objective selection function is used, fitnessFunction must return anumerical vector of fitness values.

stopCondition The stop condition for the evolution main loop. See makeStepsStopConditionfor details.

passStopCondition

The stop condition for each parallel pass. See makeStepsStopCondition for de-tails.

numberOfNiches The number of niches to cluster the population into.

Page 38: Package ‘rgp’ · rgp-package The RGP package Description RGP is a simple yet flexible Genetic Programming system for the R environment. The system implements classical untyped

38 multiNicheGeneticProgramming

clusterFunction

The function used to cluster the population into niches. The first parameter ofthis function is a GP population, the second paramater an integer representingthe number of niches. Defaults to groupListConsecutive.

joinFunction The function used to join all niches into a population again after a round ofparallel passes. Defaults to a function that simply concatenates all niches.

population The GP population to start the run with. If this parameter is missing, a new GPpopulation of size populationSize is created through random growth.

populationSize The number of individuals if a population is to be created.

eliteSize The number of "elite" individuals to keep. Defaults to ceiling(0.1 * populationSize).

elite The elite list, must be alist of individuals sorted in ascending order by their firstfitness component.

functionSet The function set.

inputVariables The input variable set.

constantSet The set of constant factory functions.selectionFunction

The selection function to use. Defaults to tournament selection. See makeTour-namentSelection for details.

crossoverFunction

The crossover function.mutationFunction

The mutation function.restartCondition

The restart condition for the evolution main loop. See makeFitnessStagnation-RestartCondition for details.

restartStrategy

The strategy for doing restarts. See makeLocalRestartStrategy for details.progressMonitor

A function of signature function(population, fitnessFunction, stepNumber, evaluationNumber,bestFitness, timeElapsed)to be called with each evolution step.

verbose Whether to print progress messages.

clusterApply The cluster apply function that is used to distribute the parallel passes to CPUsin a compute cluster.

clusterExport A function that is used to export R variables to the nodes of a CPU cluster,defaults to sfExport.

Details

Perform a multi-niche genetic programming run. The required argument fitnessFunction mustbe supplied with an objective function that assigns a numerical fitness value to an R function.Fitness values are minimized, i.e. smaller values mean higher/better fitness. If a multi-objectiveselectionFunction is used, fitnessFunction return a numerical vector of fitness values. Ina multi-niche genetic programming run, the initial population is clustered via a clusterFunctioninto numberOfNiches niches. In each niche, a genetic programming run is executed with passStopConditionas stop condition. These runs are referred to as a parallel pass. After each parallel pass, the niches

Page 39: Package ‘rgp’ · rgp-package The RGP package Description RGP is a simple yet flexible Genetic Programming system for the R environment. The system implements classical untyped

multiNicheSymbolicRegression 39

are joined again using a joinFunction into a population. From here, the process starts again witha clustering step, until the global stopCondition is met. The result of the multi-niche genetic pro-gramming run is a genetic programming result object containing a GP population of R functions.summary.geneticProgrammingResult can be used to create summary views of a GP result object.

Value

A genetic programming result object that contains a GP population in the field population, as wellas metadata describing the run parameters.

See Also

geneticProgramming, summary.geneticProgrammingResult, symbolicRegression

multiNicheSymbolicRegression

Symbolic regression via multi-niche standard genetic programming...

Description

Symbolic regression via multi-niche standard genetic programming

Usage

multiNicheSymbolicRegression(formula, data, stopCondition=makeTimeStopCondition(25),passStopCondition=makeTimeStopCondition(5), numberOfNiches=2,clusterFunction=groupListConsecutive, joinFunction=function(niches)Reduce(c, niches), population, populationSize=100,eliteSize=ceiling(0.1 * populationSize), elite=list(),individualSizeLimit=64, penalizeGenotypeConstantIndividuals=FALSE,functionSet=mathFunctionSet, constantSet=numericConstantSet,selectionFunction=makeTournamentSelection(),crossoverFunction=crossover, mutationFunction,restartCondition=makeEmptyRestartCondition(),restartStrategy=makeLocalRestartStrategy(), progressMonitor,verbose=TRUE, clusterApply=sfClusterApplyLB, clusterExport=sfExport)

Arguments

formula A formula describing the regression task. Only simple formulas of the formresponse ~ variable1 + ... + variableN are supported at this point intime.

data A data.frame containing training data for the symbolic regression run. Thevariables in formula must match column names in this data frame.

stopCondition The stop condition for the evolution main loop. See makeStepsStopConditionfor details.

Page 40: Package ‘rgp’ · rgp-package The RGP package Description RGP is a simple yet flexible Genetic Programming system for the R environment. The system implements classical untyped

40 multiNicheSymbolicRegression

passStopCondition

The stop condition for each parallel pass. See makeStepsStopCondition for de-tails.

numberOfNiches The number of niches to cluster the population into.clusterFunction

The function used to cluster the population into niches. The first parameter ofthis function is a GP population, the second paramater an integer representingthe number of niches. Defaults to groupListConsecutive.

joinFunction The function used to join all niches into a population again after a round ofparallel passes. Defaults to a function that simply concatenates all niches.

population The GP population to start the run with. If this parameter is missing, a new GPpopulation of size populationSize is created through random growth.

populationSize The number of individuals if a population is to be created.

eliteSize The number of "elite" individuals to keep. Defaults to ceiling(0.1 * populationSize).

elite The elite list, must be alist of individuals sorted in ascending order by their firstfitness component.

individualSizeLimit

Individuals with a number of tree nodes that exceeds this size limit will get afitness of Inf.

penalizeGenotypeConstantIndividuals

Individuals that do not contain any input variables will get a fitness of Inf.

functionSet The function set.

constantSet The set of constant factory functions.selectionFunction

The selection function to use. Defaults to tournament selection. See makeTour-namentSelection for details.

crossoverFunction

The crossover function.mutationFunction

The mutation function.restartCondition

The restart condition for the evolution main loop. See makeFitnessStagnation-RestartCondition for details.

restartStrategy

The strategy for doing restarts. See makeLocalRestartStrategy for details.progressMonitor

A function of signature function(population, fitnessfunction, stepNumber, evaluationNumber,bestFitness, timeElapsed)to be called with each evolution step.

verbose Whether to print progress messages.

clusterApply The cluster apply function that is used to distribute the parallel passes to CPUsin a compute cluster.

clusterExport A function that is used to export R variables to the nodes of a CPU cluster,defaults to snowfall’s sfExport.

Page 41: Package ‘rgp’ · rgp-package The RGP package Description RGP is a simple yet flexible Genetic Programming system for the R environment. The system implements classical untyped

new.alist 41

Details

Perform symbolic regression via untyped multi-niche genetic programming. The regression task isspecified as a formula. Only simple formulas without interactions are supported. The result of thesymbolic regression run is a symbolic regression model containing an untyped GP population ofmodel functions.

Value

An symbolic regression model that contains an untyped GP population.

See Also

predict.symbolicRegressionModel, geneticProgramming

new.alist Create a new function argument list from a list or vector of strings...

Description

Create a new function argument list from a list or vector of strings

Usage

new.alist(fargs)

Arguments

fargs The formal arguments, given as a list or vector of strings.

Details

Creates a formal argument list from a list or vector of strings, ready to be assigned via formals.

Value

A formal argument list, ready to be passed via formals.

Page 42: Package ‘rgp’ · rgp-package The RGP package Description RGP is a simple yet flexible Genetic Programming system for the R environment. The system implements classical untyped

42 nondeterministicRanking

new.function Create a new function stub...

Description

Create a new function stub

Details

Creates and returns a new function stub without capturing any environment variables.

Value

A new function that does not take any arguments and always returns NULL.

Note

Always use this function to dynamically generate new functions that are not clojures to prevent hardto find memory leaks.

nondeterministicRanking

Create a nondeterministic ranking...

Description

Create a nondeterministic ranking

Usage

nondeterministicRanking(l, p=1)

Arguments

l The numer of elements in the ranking.

p The "degree of determinism" of the ranking to create.

Details

Create a permutation of the sequence s = 1:l representing a ranking. If p = 1, the ranking willbe completely deterministic, i.e. equal to 1:l. If p = 0, the ranking will be completely random.If 0 < p < 1, the places in the ranking will be determined by iterative weighted sampling withoutreplacement from the sequence s := 1:l. At each step of this iterated weighted sampling, the firstremaining element of s will be selected with probability p, the second element with probabilityp * (1 - p), the third element with probability p * (1 - p) ^ 2, and so forth.

Page 43: Package ‘rgp’ · rgp-package The RGP package Description RGP is a simple yet flexible Genetic Programming system for the R environment. The system implements classical untyped

normalize 43

Value

A ranking permutation of the values 1:l.

normalize Normalize a vector into the interval [0, 1]...

Description

Normalize a vector into the interval [0, 1]

Usage

normalize(x)

Arguments

x The vector to normalize, so that each element lies in the interval [0, 1].

Value

The normalized vector.

orderByParetoMeasure Rearrange points via an arbitrary Pareto-based ranking...

Description

Rearrange points via an arbitrary Pareto-based ranking

Usage

orderByParetoMeasure(values, measure=crowding_distance)

Arguments

values The value matrix to return the ordering permutation for. Each column representsa point, each row a dimension.

measure The measure used for ranking points that lie on the same Pareto front, defaultsto crowding_distance.

Details

Returns a permutation that rearranges points, given as columns in a value matrix, via Pareto-basedranking. Points are ranked by their Pareto front number, ties are broken by the values of measure.

Value

A permutation to rearrange values based on a Pareto based ranking.

Page 44: Package ‘rgp’ · rgp-package The RGP package Description RGP is a simple yet flexible Genetic Programming system for the R environment. The system implements classical untyped

44 plotFunctions

paretoSorting Rearrange points via Pareto-based rankings...

Description

Rearrange points via Pareto-based rankings

Usage

orderByParetoCrowdingDistance(values)orderByParetoHypervolumeContribution(values)

Arguments

values The value matrix to return the ordering permutation for. Each column representsa point, each row a dimension.

Details

orderByParetoCrowdingDistance: Returns a permutation that rearranges points, given as columnsin a value matrix, via Pareto-based ranking. Points are ranked by their Pareto front number. InorderByParetoCrowdingDistance, ties are then broken by crowding distance, in orderByParetoHypervolumeContribution,ties are broken by hypervolume contribution.

Value

orderByParetoCrowdingDistance: A permutation to rearrange values based on a Pareto basedranking.

plotFunctions Show an overlayed plot of multiple functions...

Description

Show an overlayed plot of multiple functions

Usage

plotFunctions(funcs, from=0, to=1, steps=1024, type="l", lty=1:5, lwd=1,lend=par("lend"), pch, col=1:6, cex, bg=NA, xlab="x", ylab="y",legendpos="bottomright", bty="n", ...)

Page 45: Package ‘rgp’ · rgp-package The RGP package Description RGP is a simple yet flexible Genetic Programming system for the R environment. The system implements classical untyped

plotPopulationFitnessComplexity 45

Arguments

funcs A list of functions of one variable to plot.

from The left bound of the plot, i.e. the minimum x value to plot.

to The right bound of the plot, i.e. the maximum x value to plot.

steps The number of steps, or samples, to plot.

type The plot type (e.g. l = line) as passed on to matplot.

lty The line types as passed on to matplot.

lwd The line widths as passed on to matplot.

lend The line end cap types as passed on to matplot.

pch The plot chars as passed on to matplot.

col The plot colors as passed on to matplot.

cex The character expansion sizes as passed on to matplot.

bg The background (fill) colors as passed on to matplot.

xlab The x axis label as passed on to matplot.

ylab The y axis label as passed on to matplot.

legendpos The position of the legend, passed as the x parameter to legend.

bty The box type parameter of the legend, passed as the bty parameter to legend.

... Grapical parameters for par and further arguments to plot. For example, usethe main parameter to set a title.

Details

Creates and shows and overlayed plot of one or more functions of one variable y = f(x).

Examples

plotFunctions(list(function(x) sin(x), function(x) cos(x), function(x) 0.5*sin(2*x)+1), -pi, pi, 256)

plotPopulationFitnessComplexity

Fitness/Complexity plot for populations...

Description

Fitness/Complexity plot for populations

Usage

plotPopulationFitnessComplexity(pop, fitnessFunction, complexityFunction=funcVisitationLength,showIndices=TRUE, showParetoFront=TRUE, hideOutliers=0, ...)

Page 46: Package ‘rgp’ · rgp-package The RGP package Description RGP is a simple yet flexible Genetic Programming system for the R environment. The system implements classical untyped

46 popfitness

Arguments

pop A population to plot.

fitnessFunction

The function to calculate an individual’s fitness with.complexityFunction

The function to calculate an individual’s complexity with.

showIndices Whether to show the population index of each individual.

showParetoFront

Whether to highlight the pareto front in the plot.

hideOutliers If N = hideOutliers > 0, hide outliers from the plot using a "N * IQR" crite-rion.

... Additional parameters for the underlying call to plot.

Details

Plots the fitness against the complexity of each individual in a population.

popfitness Calculate the fitness value of each individual in a population...

Description

Calculate the fitness value of each individual in a population

Usage

popfitness(pop, fitnessfunc)

Arguments

pop A population of functions.

fitnessfunc The fitness function.

Value

A list of fitness function values in the same order as pop.

Page 47: Package ‘rgp’ · rgp-package The RGP package Description RGP is a simple yet flexible Genetic Programming system for the R environment. The system implements classical untyped

populationClustering 47

populationClustering Clustering Populations for Niching...

Description

Clustering Populations for Niching

Usage

makeHierarchicalClusterFunction(distanceMeasure=normInducedFunctionDistance(exprVisitationLength),minNicheSize=1)

Arguments

distanceMeasure

A distance measure, used for calculating distances between individuals in a pop-ulation.

minNicheSize The minimum number of individuals in each niche.

Details

These functions create clusterFunctions for multiNicheGeneticProgramming and multiNicheSymbolicRegression.makeHierarchicalClusterFunction returns a clustering function that uses Ward’s agglomerativehierarchical clustering algorithm hclust.

Value

A clusterFunction for clustering populations.

See Also

multiNicheGeneticProgramming, multiNicheSymbolicRegression

populationCreation Classes for populations of individuals represented as functions...

Description

Classes for populations of individuals represented as functions

Page 48: Package ‘rgp’ · rgp-package The RGP package Description RGP is a simple yet flexible Genetic Programming system for the R environment. The system implements classical untyped

48 populationCreation

Usage

makePopulation(size, funcset, inset, conset, maxfuncdepth=8, constprob=0.2,breedingFitness=function(individual) TRUE, breedingTries=50,extinctionPrevention=FALSE, funcfactory=function()randfuncRampedHalfAndHalf(funcset, inset, conset, maxfuncdepth,constprob = constprob, breedingFitness = breedingFitness,breedingTries = breedingTries))

makeTypedPopulation(size, type, funcset, inset, conset, maxfuncdepth=8, constprob=0.2,breedingFitness=function(individual) TRUE, breedingTries=50,extinctionPrevention=FALSE, funcfactory=function()randfuncTypedRampedHalfAndHalf(type, funcset, inset, conset,maxfuncdepth, constprob = constprob, breedingFitness =breedingFitness, breedingTries = breedingTries))

print.population(x, ...)summary.population(object, ...)

Arguments

size The population size in number of individuals.

type The (range) type of the individual functions to create.

funcset The function set.

inset The set of input variables.

conset The set of constant factories.

maxfuncdepth The maximum depth of the functions of the new population.

constprob The probability of generating a constant in a step of growth, if no subtree isgenerated. If neither a subtree nor a constant is generated, a randomly choseninput variable will be generated. Defaults to 0.2.

breedingFitness

A breeding function. See the documentation for geneticProgramming for de-tails.

breedingTries The number of breeding steps.

extinctionPrevention

When set to TRUE, initialization will try to prevent duplicate individuals fromoccurring in the population. Defaults to FALSE, as this operation might be ex-pensive with larger population sizes.

funcfactory A factory for creating the functions of the new population. Defaults to Koza’s"ramped half-and-half" initialization strategy.

x The population to print.

object The population to summarize.

... Additional parameters to the print or summary (passed on to their default im-plementation).

Page 49: Package ‘rgp’ · rgp-package The RGP package Description RGP is a simple yet flexible Genetic Programming system for the R environment. The system implements classical untyped

predict.symbolicRegressionModel 49

Details

makePopulation: makePopulation creates a population of untyped individuals, whereas makeTypedPopulationcreates a population of typed individuals. print.population prints the population. summary.populationreturns a summary view of a population.

Value

makePopulation: A new population of functions.

predict.symbolicRegressionModel

Predict method for symbolic regression models...

Description

Predict method for symbolic regression models

Usage

predict.symbolicRegressionModel(object, newdata, model="BEST", detailed=FALSE, ...)

Arguments

object A model created by symbolicRegression.

newdata A data.frame containing input data for the symbolic regression model. Thevariables in object$formula must match column names in this data frame.

model The numeric index of the model function in object$population to use forprediction or "BEST" to use the model function with the best training fitness.

detailed Whether to add metadata to the prediction object returned.

... Ignored in this predict method.

Details

Predict values via a model function from a population of model functions generated by symbolicregression.

Value

A vector of predicted values or, if detailed is TRUE, a list of the following elements: model themodel used in this prediction response a matrix of predicted versus respone values RMSE the RMSEbetween the real and predicted response

Page 50: Package ‘rgp’ · rgp-package The RGP package Description RGP is a simple yet flexible Genetic Programming system for the R environment. The system implements classical untyped

50 randelt

print.sType Prints a sType and returns it invisible.

Description

Prints a sType and returns it invisible.

Usage

print.sType(x, ...)

Arguments

x The sType to print.

... Optional parameters to print are ignored in this method.

randelt Choose a random element from a list or vector...

Description

Choose a random element from a list or vector

Usage

randelt(x, prob)

Arguments

x The vector or list to chose an element from.

prob A vector of probability weights for obtaining the elements of the vector or listbeing sampled.

Details

Returns a unformly random chosen element of the vector or list x.

Value

A uniformly random element of x.

Page 51: Package ‘rgp’ · rgp-package The RGP package Description RGP is a simple yet flexible Genetic Programming system for the R environment. The system implements classical untyped

randomExpressionChilds 51

randomExpressionChilds

Select random childs or subtrees of an expression...

Description

Select random childs or subtrees of an expression

Usage

randchild(expr)randsubtree(expr, subtreeprob=0.1)

Arguments

expr The expression to select random childs or subtrees from.

subtreeprob The probability for randsubtree to select a certain subtree instead of searchingfurther via an recursive call.

Details

randchild: randchild returns a uniformly random direct child of an expression. randsubtreereturns a uniformly random subtree of an expression. Note that this subtree must not be a directchild.

randomExpressionCreation

Creates an R expression by random growth...

Description

Creates an R expression by random growth

Usage

randexprGrow(funcset, inset, conset, maxdepth=8, constprob=0.2, subtreeprob=0.5,curdepth=1)

randexprFull(funcset, inset, conset, maxdepth=8, constprob=0.2)

Page 52: Package ‘rgp’ · rgp-package The RGP package Description RGP is a simple yet flexible Genetic Programming system for the R environment. The system implements classical untyped

52 randomExpressionCreationTyped

Arguments

funcset The function set.

inset The set of input variables.

conset The set of constant factories.

maxdepth The maximum expression tree depth.

constprob The probability of generating a constant in a step of growth, if no subtree isgenerated. If neither a subtree nor a constant is generated, a randomly choseninput variable will be generated. Defaults to 0.2.

subtreeprob The probability of generating a subtree in a step of growth.

curdepth (internal) The depth of the random expression currently generated, used inter-nally in recursive calls.

Details

randexprGrow: Creates a random R expression by randomly growing its tree. In each step ofgrowth, with probability subtreeprob, an operator is chosen from the function set funcset. Theoperands are then generated by recursive calls. If no subtree is generated, a constant will be gen-erated with probability constprob. If no constant is generated, an input variable will be chosenrandomly. The depth of the resulting expression trees can be bounded by the maxdepth parameter.randexprFull creates a random full expression tree of depth maxdepth. The algorithm is the sameas randexprGrow, with the exception that the probability of generating a subtree is fixed to 1 untilthe desired tree depth maxdepth is reached.

Value

randexprGrow: A new R expression generated by random growth.

randomExpressionCreationTyped

Creates an R expression by random growth respecting type con-straints...

Description

Creates an R expression by random growth respecting type constraints

Usage

randexprTypedGrow(type, funcset, inset, conset, maxdepth=8, constprob=0.2,subtreeprob=0.5, curdepth=1)

randexprTypedFull(type, funcset, inset, conset, maxdepth=8, constprob=0.2)

Page 53: Package ‘rgp’ · rgp-package The RGP package Description RGP is a simple yet flexible Genetic Programming system for the R environment. The system implements classical untyped

randomFunctionCreation 53

Arguments

type The (range) type the created expression should have.

funcset The function set.

inset The set of input variables.

conset The set of constant factories.

maxdepth The maximum expression tree depth.

constprob The probability of generating a constant in a step of growth, if no subtree isgenerated. If neither a subtree nor a constant is generated, a randomly choseninput variable will be generated. Defaults to 0.2.

subtreeprob The probability of generating a subtree in a step of growth.

curdepth (internal) The depth of the random expression currently generated, used inter-nally in recursive calls.

Details

randexprTypedGrow: Creates a random R expression by randomly growing its tree. In each stepof growth, with probability subtreeprob, an operator is chosen from the function set funcset.The operands are then generated by recursive calls. If no function of matching range type exists, aterminal (constant or input variable) will be generated instead. If no subtree is generated, a constantwill be generated with probability constprob. If no constant is generated, an input variable willbe chosen randomly. The depth of the resulting expression trees can be bounded by the maxdepthparameter. In contrast to randexprGrow, this function respects sType tags of functions, input vari-ables, and constant factories. Only well-typed expressions are created. randexprTypedFull createsa random full expression tree of depth maxdepth, respecting type constraints. All nodes in the cre-ated expressions will be tagged with their sTypes.

Value

randexprTypedGrow: A new R expression generated by random growth.

randomFunctionCreation

Creates an R function with a random expression as its body...

Description

Creates an R function with a random expression as its body

Usage

randfunc(funcset, inset, conset, maxdepth=8, constprob=0.2,exprfactory=randexprGrow, breedingFitness=function(individual)TRUE, breedingTries=50)

randfuncRampedHalfAndHalf(funcset, inset, conset, maxdepth=8, constprob=0.2,breedingFitness=function(individual) TRUE, breedingTries=50)

Page 54: Package ‘rgp’ · rgp-package The RGP package Description RGP is a simple yet flexible Genetic Programming system for the R environment. The system implements classical untyped

54 randomFunctionCreationTyped

Arguments

funcset The function set.

inset The set of input variables.

conset The set of constant factories.

maxdepth The maximum expression tree depth.

exprfactory The function to use for randomly creating the function’s body.

constprob The probability of generating a constant in a step of growth, if no subtree isgenerated. If neither a subtree nor a constant is generated, a randomly choseninput variable will be generated. Defaults to 0.2.

breedingFitness

A breeding function. See the documentation for geneticProgramming for de-tails.

breedingTries The number of breeding steps.

Value

randfunc: A randomly generated R function.

randomFunctionCreationTyped

Creates a well-typed R function with a random expression as its body...

Description

Creates a well-typed R function with a random expression as its body

Usage

randfuncTyped(type, funcset, inset, conset, maxdepth=8, constprob=0.2,exprfactory=randexprTypedGrow, breedingFitness=function(individual)TRUE, breedingTries=50)

randfuncTypedRampedHalfAndHalf(type, funcset, inset, conset, maxdepth=8, constprob=0.2,breedingFitness=function(individual) TRUE, breedingTries=50)

Arguments

type The range type of the random function to create.

funcset The function set.

inset The set of input variables.

conset The set of constant factories.

maxdepth The maximum expression tree depth.

constprob The probability of generating a constant in a step of growth, if no subtree isgenerated. If neither a subtree nor a constant is generated, a randomly choseninput variable will be generated. Defaults to 0.2.

Page 55: Package ‘rgp’ · rgp-package The RGP package Description RGP is a simple yet flexible Genetic Programming system for the R environment. The system implements classical untyped

randterminalTyped 55

exprfactory The function to use for randomly creating the function’s body.breedingFitness

A breeding function. See the documentation for geneticProgramming for de-tails.

breedingTries The number of breeding steps.

Value

randfuncTyped: A randomly generated well-typed R function.

randterminalTyped Create a random terminal node...

Description

Create a random terminal node

Usage

randterminalTyped(typeString, inset, conset, constprob)

Arguments

typeString The string label of the type of the random terminal node to create.

inset The set of input variables.

conset The set of constant factories.

constprob The probability of creating a constant versus an input variable.

Value

A random terminal node, i.e. an input variable or a constant.

rangeTypeOfType Return the range type if t is a function type, otherwise just return t...

Description

Return the range type if t is a function type, otherwise just return t

Usage

rangeTypeOfType(t)

Page 56: Package ‘rgp’ · rgp-package The RGP package Description RGP is a simple yet flexible Genetic Programming system for the R environment. The system implements classical untyped

56 rgpTestingAndBenchmarking

Arguments

t The type to extract the range type from.

Value

The range type.

rgpTestingAndBenchmarking

Utility functions for testing and benchmarking the RGP system...

Description

Utility functions for testing and benchmarking the RGP system

Usage

rgpBenchmark(fitnessFunction=function(ind) 0, samples=1, time=10, ...)evaluationsPerSecondBenchmark(f, samples=1, time=10, ...)

Arguments

f The function under test.fitnessFunction

The fitness function to pass to the call to geneticProgramming.

samples The number of indpendent measurements to perform, defaults to 1.

time The time in seconds a sample lasts, defaults to 10 seconds.

... Options as passed to the function under test.

Details

rgpBenchmark: rgpBenchmark measures the number of fitness evaluations per second performedby geneticProgramming. A number of samples experiments are performed.

evaluationsPerSecondBenchmark measures the number of times a function can be called persecond in a tight loop.

Value

rgpBenchmark: The number of fitness evaluations per second performed by RGP.

Page 57: Package ‘rgp’ · rgp-package The RGP package Description RGP is a simple yet flexible Genetic Programming system for the R environment. The system implements classical untyped

rmse 57

rmse Root mean squared error (RMSE)...

Description

Root mean squared error (RMSE)

Usage

rmse(x, y)

Arguments

x A numeric vector or list.

y A numeric vector or list.

Value

The RMSE between x and y.

safeGPfunctions Some simple arithmetic and logic functions for use in GP expressions...

Description

Some simple arithmetic and logic functions for use in GP expressions

Usage

safeDivide(a, b)safeSqroot(a)safeLn(a)ln(a)positive(x)ifPositive(x, thenbranch, elsebranch)ifThenElse(x, thenbranch, elsebranch)

Arguments

a A numeric value.

b A numeric value.

x A numeric value.

thenbranch The element to return when x is TRUE.

elsebranch The element to return when x is FALSE.

Page 58: Package ‘rgp’ · rgp-package The RGP package Description RGP is a simple yet flexible Genetic Programming system for the R environment. The system implements classical untyped

58 searchSpaceDefinition

Details

safeDivide: safeDivide a division operator that returns 0 if the divisor is 0. safeLn a naturallogarithm operator that return 0 if its argument is less then 0. ln is the natural logarithm. positivereturns true if its argument is greater then 0. ifPositive returns its second argument if its firstargument is positive, otherwise its third argument. ifThenElse returns its second argument if itsfirst argument is TRUE, otherwise its third argument.

searchSpaceDefinition Functions for defining the search space for Genetic Programming...

Description

Functions for defining the search space for Genetic Programming

Usage

functionSet(..., list)inputVariableSet(..., list)constantFactorySet(..., list)pw(x, pw)hasPw(x)getPw(x, default=1)c.functionSet(..., recursive=FALSE)c.inputVariableSet(..., recursive=FALSE)c.constantFactorySet(..., recursive=FALSE)

Arguments

... Names of functions or input variables given as strings.

list Names of functions or input variables given as a list of strings.

recursive Ignored when concatenating function- or input variable sets.

x An object (function name, input variable name, or constant factory) to tag witha probability pw.

pw A probability weight.

default A default probability weight to return iff no probability weight is associated withan object.

Details

functionSet: The GP search space is defined by a set of functions, a set of input variables, a set ofconstant constructor functions, and some rules how these functions, input variables, and constantsmay be combined to form valid symbolic expressions. The function set is simply given as a setof strings naming functions in the global environment. Input variables are also given as strings.Combination rules are implemented by a type system and defined by assigning sTypes to functions,input variables, and constant constructors.

Page 59: Package ‘rgp’ · rgp-package The RGP package Description RGP is a simple yet flexible Genetic Programming system for the R environment. The system implements classical untyped

selectionFunctions 59

Function sets and input variable sets are S3 classes containing the following fields: $all contains alist of all functions, or input variables, or constant factories. $byRange contains a table of all inputvariables, or functions, or constant factories, indexed by the string label of their sTypes for inputvariables, or by the string label of their range sTypes for functions, or by the string label of theirrange sTypes for constant factories. This field exists mainly for quickly finding a function, inputvariable, or constant factory that matches a given type.

Multiple function sets, or multiple input variable sets, or multiple constant factory sets can becombined using the c function. functionSet creates a function set. inputVariableSet creates aninput variable set. constantFactorySet creates a constant factory set.

Probability weight for functions, input variables, and constants can be given by tagging constantnames, input variables, and constant factory functions via the pw function (see the examples). Thepredicate hasPw can be used to check if an object x has an associated probability weight. Thefunction getPw returns the probability weight associated with an object x, if available.

Value

functionSet: A function set or input variable set.

Examples

# creating an untyped search space description...functionSet("+", "-", "*", "/", "exp", "log", "sin", "cos", "tan")inputVariableSet("x", "y")constantFactorySet(function() runif(1, -1, 1))# creating an untyped function set with probability weights...functionSet(pw("+", 1.2), pw("-", 0.8), pw("*", 1.0), pw("/", 1.0))

selectionFunctions GP selection functions...

Description

GP selection functions

Usage

makeTournamentSelection(tournamentSize=10, selectionSize=ceiling(tournamentSize/2),tournamentDeterminism=1, vectorizedFitness=FALSE)

makeMultiObjectiveTournamentSelection(tournamentSize=30, selectionSize=ceiling(tournamentSize/2),tournamentDeterminism=1, vectorizedFitness=FALSE,rankingStrategy=orderByParetoCrowdingDistance)

makeComplexityTournamentSelection(tournamentSize=30, selectionSize=ceiling(tournamentSize/2),tournamentDeterminism=1, vectorizedFitness=FALSE,rankingStrategy=orderByParetoCrowdingDistance,complexityMeasure=funcVisitationLength)

Page 60: Package ‘rgp’ · rgp-package The RGP package Description RGP is a simple yet flexible Genetic Programming system for the R environment. The system implements classical untyped

60 selectionFunctions

Arguments

complexityMeasure

The function used to measure the complexity of an individual.

tournamentSize The number of individuals to randomly select to form a tournament, defaults to10 in the single-objective case, 30 in the multi-objective case.

selectionSize The number of individuals to return as selected.tournamentDeterminism

The propability p for selecting the best individual in a tournament, must be inthe interval (0.0, 1.0]. The best individual is selected with propability p, thesecond best individual is selected with propability p * (1 - p), the third bestindividual ist selected with propability p * (1 - p)^2, and so on. Note that settingtournamentDeterminism to 1.0 (the default) yields determistic behavior.

vectorizedFitness

If TRUE, the fitness function is expected to take a list of individuals as input andreturn a list of (possible vector-valued) fitnesses as output.

rankingStrategy

The strategy used to rank individuals based on multiple objectives. This functionmust turn a fitness vector (one point per column) into an ordering permutation(similar to the one returned by order). Defaults to orderByParetoCrowdingDistance.

Details

makeTournamentSelection: A GP selection function determines which individuals in a popula-tion should survive, i.e. are selected for variation or cloning, and which individuals of a populationshould be replaced. Single-objective selection functions base their selection decision on scalar fit-ness function, whereas multi-objective selection functions support vector-valued fitness functions.Every selection function takes a population and a (possibly vector-valued) fitness function as re-quired arguments. It returns a list of two tables selected and discarded, with columns indexand fitness each. The returned list also contains a single integer numberOfFitnessEvaluationsthat contains the number of fitness evaluations used to make the selection (Note that in the multi-objective case, evaluating all fitness functions once counts as a single evaluation). The first tablecontains the population indices of the individuals selected as survivors, the second table containsthe population indices of the individuals that should be discarded and replaced. This definition sim-plifies the implementation of steady-state evolutionary strategies where most of the individuals in apopulation are unchanged in each selection step. In a GP context, steady-state strategies are oftenmore efficient than generational strategies.

makeTournamentSelection returns a classic single-objective tournament selection function. makeMultiObjectiveTournamentSelectionreturns a multi-objective tournament selection function that selects individuals based on multipleobjectives. makeComplexityTournamentSelection returns a multi-objective selection functionthat implements the common case of dual-objective tournament selection with high solution qualityas the first objective and low solution complexity as the second objective.

Value

makeTournamentSelection: A selection function.

Page 61: Package ‘rgp’ · rgp-package The RGP package Description RGP is a simple yet flexible Genetic Programming system for the R environment. The system implements classical untyped

smse 61

smse Scaled mean squared error (SMSE)...

Description

Scaled mean squared error (SMSE)

Usage

smse(x, y)

Arguments

x A numeric vector or list.y A numeric vector or list.

Details

Calculates the MSE between vectors after normalizing them into the interval [0, 1].

Value

The SMSE between x and y.

sortBy Sort a vector or list by the result of applying a function...

Description

Sort a vector or list by the result of applying a function

Usage

sortBy(xs, byFunc)

Arguments

xs A vector or list.byFunc A function from elements of xs to numeric.

Details

Sorts a vector or a list by the numerical result of applying the function byFunc.

Value

The result of sorting xs by byfunc.

Page 62: Package ‘rgp’ · rgp-package The RGP package Description RGP is a simple yet flexible Genetic Programming system for the R environment. The system implements classical untyped

62 sortByRanking

sortByRange Tabulate a list of functions or input variables by the range part of theirsTypes...

Description

Tabulate a list of functions or input variables by the range part of their sTypes

Usage

sortByRange(x)

Arguments

x A list of functions or input variables to sort by range sType.

Value

A table of the objects keyed by their range sTypes.

sortByRanking Sort a vector or list via a given ranking...

Description

Sort a vector or list via a given ranking

Usage

sortByRanking(xs, ranking=rank(xs))

Arguments

xs The vector or list to reorder.

ranking The ranking to sort xs by, defaults to rank(xs).

Details

Reorders a vector or list according to a given ranking ranking.

Value

The result of reordering xs by ranking.

Page 63: Package ‘rgp’ · rgp-package The RGP package Description RGP is a simple yet flexible Genetic Programming system for the R environment. The system implements classical untyped

sortByType 63

sortByType Tabulate a list of functions or input variables by their sTypes...

Description

Tabulate a list of functions or input variables by their sTypes

Usage

sortByType(x)

Arguments

x A list of functions or input variables to sort by sType.

Value

A table of the objects keyed by their sTypes.

sortingAlgorithms Sorting algorithms for vectors and lists...

Description

Sorting algorithms for vectors and lists

Usage

insertionSort(xs, orderRelation)

Arguments

xs The vector or list to sort.

orderRelation The orderRelation to sort xs by (defaults to ‘<=‘). This relation by shouldreflexive, antisymetric, and transitive.

Details

These algorithms sort a list or vector by a given order relation (which defaults to <=). insertionSortis a stable O(n^2) sorting algorithm that is quite efficient for very small sets (less than around 20elements). Use an O(n*log(n)) algorithm for larger sets.

Value

The vector or list xs sorted by the order relation orderRelation.

Page 64: Package ‘rgp’ · rgp-package The RGP package Description RGP is a simple yet flexible Genetic Programming system for the R environment. The system implements classical untyped

64 sTypeConstructors

sTypeConstructors Type constructors for types in the Rsymbolic type system...

Description

Type constructors for types in the Rsymbolic type system

Usage

st(baseTypeName)

Arguments

baseTypeName The name of the base sType to create.

Details

st: These functions create types for the Rsymbolic type system, called sTypes from here on. Thesefunctions are used mostly in literal expressions denoting sTypes. st creates a base sType from astring. A base sType is a type without any further structure. Example include st("numeric"),st("character") or st("logical"). %->% creates a function sType, i.e. the type of function,from a vector of argument sTypes and a result sType. A function sType has domain and rangecontaining its argument and result types. Every sType has a string field containing a unambiguousstring representation that can serve as a hash table key. STypes can be checked for equality viaidentical. sObject is the root of the sType hierarchy, i.e. the most general type.

Value

st: The created sType.

See Also

sTypeTags

Examples

st("numeric")list(st("numeric"), st("numeric")) %->% st("logical")is.sType(st("logical"))

Page 65: Package ‘rgp’ · rgp-package The RGP package Description RGP is a simple yet flexible Genetic Programming system for the R environment. The system implements classical untyped

sTypeInference 65

sTypeInference Inference of sTypes...

Description

Inference of sTypes

Usage

calculateSTypeRecursive(x, typeEnvir=rgpSTypeEnvironment, formalsStack=list())getSTypeFromFormalsStack(x, formalsStack)setSTypeOnFormalsStack(x, value, formalsStack)

Arguments

x The object to operate on.

value An sType.

typeEnvir The type environment, containing user-supplied sTypes of building blocks.

formalsStack A stack of formal arguments with their sTypes.

Details

calculateSTypeRecursive: RGP internally infers the sTypes of compound expressions like func-tion applications and function definitions from the sTypes of atomic expressions. The sTypesof building blocks are defined by the user via the %::% operator and are stored in the package-internal global variable rgpSTypeEnvironment. calculateSTypeRecursive calculates the sTypeof the R expression x. SType inference of function definitions relies on a typed stack of for-mal arguments of the function definition to infer the sType for. getSTypeFromFormalsStack andsetSTypeOnFormalsStack get or set the sType of a formal argument x and a formalsStack, re-spectively.

See Also

sTypeConstructors, sTypeTags

sTypeTags Tagging objects with sTypes...

Description

Tagging objects with sTypes

Page 66: Package ‘rgp’ · rgp-package The RGP package Description RGP is a simple yet flexible Genetic Programming system for the R environment. The system implements classical untyped

66 subDataFrame

Usage

sType(x)‘sType<-‘(x, value)hasStype(x)

Arguments

x The object to retreive, check, or set the sType tag for.

value An sType.

Details

sType: Objects may be tagged with sTypes. Type tags are stored in the "sType" attribute. sTypereturns the sType tag for an object. Assign to the result of this function to set the sType tag for anobject (e.g. sType(foo) <- st("integer")). The %::% operator returns its first argument taggedwith the sType given as its second argument, without modifying the sType of its first argument.hasStype returns true if x is tagged with an sType.

See Also

sTypeConstructors, attr

Examples

foo <- "foo"sType(foo) <- st("string")sType(foo)foo %::% st("string")

subDataFrame Select a continuous subframe of a data frame...

Description

Select a continuous subframe of a data frame

Usage

subDataFrame(x, size=1, pos="START")

Arguments

x The data frame to get a subframe from.

size The size ratio of the subframe. Must be between 0 and 1.

pos The position to take the subframe from. Must be "START", "CENTER", or "END".

Page 67: Package ‘rgp’ · rgp-package The RGP package Description RGP is a simple yet flexible Genetic Programming system for the R environment. The system implements classical untyped

summary.geneticProgrammingResult 67

Details

Return a continuous subframe of the data frame x containing size * nrow(x) rows from the start,center, or end.

Value

A subframe of x.

summary.geneticProgrammingResult

Summary reports of genetic programming run result objects...

Description

Summary reports of genetic programming run result objects

Usage

summary.geneticProgrammingResult(object, reportFitness=TRUE, orderByFitness=TRUE, ...)

Arguments

object The genetic programming run result object to report on.

reportFitness Whether to report the fitness values of each individual in the result population.Note that calculating fitness values may take a long time. Defaults to TRUE.

orderByFitness Whether the report of the result population should be ordered by fitness. Thisdoes not have an effect if reportFitness is set to FALSE. Defaults to TRUE.

... Ignored in this summary function.

Details

Create a summary report of a genetic programming result object as returned by geneticProgrammingor symbolicRegression, for example.

See Also

geneticProgramming, symbolicRegression

Page 68: Package ‘rgp’ · rgp-package The RGP package Description RGP is a simple yet flexible Genetic Programming system for the R environment. The system implements classical untyped

68 symbolicRegression

symbolicRegression Symbolic regression via untyped standard genetic programming...

Description

Symbolic regression via untyped standard genetic programming

Usage

symbolicRegression(formula, data, stopCondition=makeTimeStopCondition(5), population,populationSize=100, eliteSize=ceiling(0.1 * populationSize),elite=list(), extinctionPrevention=FALSE, archive=FALSE,genealogy=FALSE, individualSizeLimit=64,penalizeGenotypeConstantIndividuals=FALSE,functionSet=mathFunctionSet, constantSet=numericConstantSet,selectionFunction=makeTournamentSelection(),crossoverFunction=crossover, mutationFunction,restartCondition=makeEmptyRestartCondition(),restartStrategy=makeLocalRestartStrategy(),breedingFitness=function(individual) TRUE, breedingTries=50,errorMeasure=rmse, progressMonitor, verbose=TRUE)

Arguments

formula A formula describing the regression task. Only simple formulas of the formresponse ~ variable1 + ... + variableN are supported at this point intime.

data A data.frame containing training data for the symbolic regression run. Thevariables in formula must match column names in this data frame.

stopCondition The stop condition for the evolution main loop. See makeStepsStopConditionfor details.

population The GP population to start the run with. If this parameter is missing, a new GPpopulation of size populationSize is created through random growth.

populationSize The number of individuals if a population is to be created.

eliteSize The number of elite individuals to keep. Defaults to ceiling(0.1 * populationSize).

elite The elite list, must be alist of individuals sorted in ascending order by their firstfitness component.

extinctionPrevention

When set to TRUE, the initialization and selection steps will try to prevent du-plicate individuals from occurring in the population. Defaults to FALSE, as thisoperation might be expensive with larger population sizes.

archive If set to TRUE, all GP individuals evaluated are stored in an archive list archiveListthat is returned as part of the result of this function.

Page 69: Package ‘rgp’ · rgp-package The RGP package Description RGP is a simple yet flexible Genetic Programming system for the R environment. The system implements classical untyped

symbolicRegression 69

genealogy If set to TRUE, the parent(s) of each indiviudal is stored in an archive genealogyListas part of the result of this function, enabling the reconstruction of the completegenealogy of the result population. This parameter implies archive = TRUE.

individualSizeLimit

Individuals with a number of tree nodes that exceeds this size limit will get afitness of Inf.

penalizeGenotypeConstantIndividuals

Individuals that do not contain any input variables will get a fitness of Inf.functionSet The function set.constantSet The set of constant factory functions.selectionFunction

The selection function to use. Defaults to tournament selection. See makeTournamentSelectionfor details.

crossoverFunction

The crossover function.mutationFunction

The mutation function.restartCondition

The restart condition for the evolution main loop. See makeEmptyRestartCon-dition for details.

restartStrategy

The strategy for doing restarts. See makeLocalRestartStrategy for details.breedingFitness

A "breeding" function. This function is applied after every stochastic opera-tion Op that creates or modifies an individal (typically, Op is a initialization,mutation, or crossover operation). If the breeding function returns TRUE on thegiven individual, Op is considered a success. If the breeding function returnsFALSE, Op is retried a maximum of breedingTries times. If this maximumnumber of retries is exceeded, the result of the last try is considered as the resultof Op. In the case the breeding function returns a numeric value, the breedingis repeated breedingTries times and the individual with the lowest breedingfitness is considered the result of Op.

breedingTries In case of a boolean breedingFitness function, the maximum number of re-tries. In case of a numerical breedingFitness function, the number of breed-ing steps. Also see the documentation for the breedingFitness parameter.Defaults to 50.

errorMeasure A function to use as an error measure, defaults to RMSE.progressMonitor

A function of signature function(population, fitnessfunction, stepNumber, evaluationNumber,bestFitness, timeElapsed)to be called with each evolution step.

verbose Whether to print progress messages.

Details

Perform symbolic regression via untyped genetic programming. The regression task is specifiedas a formula. Only simple formulas without interactions are supported. The result of the sym-bolic regression run is a symbolic regression model containing an untyped GP population of modelfunctions.

Page 70: Package ‘rgp’ · rgp-package The RGP package Description RGP is a simple yet flexible Genetic Programming system for the R environment. The system implements classical untyped

70 tabulateFunction

Value

An symbolic regression model that contains an untyped GP population.

See Also

predict.symbolicRegressionModel, geneticProgramming

tabulateFunction Tabulate a function...

Description

Tabulate a function

Usage

tabulateFunction(f, ...)

Arguments

f The function to tabulate.

... The points to tabulate f at.

Details

Creates a data frame of values for the function f at the domain points given in .... This functionworks like mapply, but returns a data frame that also contains the domain points instead of a simplevector that only contains (range) values of f. When using element names in ..., these names mustmatch the formal parameter names of f.

Value

A data frame of function values of f.

See Also

mapply

Page 71: Package ‘rgp’ · rgp-package The RGP package Description RGP is a simple yet flexible Genetic Programming system for the R environment. The system implements classical untyped

withAttributesOf 71

withAttributesOf Return expr with the attributes of sourceExpr...

Description

Return expr with the attributes of sourceExpr

Usage

withAttributesOf(expr, sourceExpr)

Arguments

expr The expression to receive the attributes of sourceExpr.

sourceExpr The expression to provide the attributes for expr.

Value

A copy of expr, tagged with the attributes and sType of sourceExpr.

Page 72: Package ‘rgp’ · rgp-package The RGP package Description RGP is a simple yet flexible Genetic Programming system for the R environment. The system implements classical untyped

Index

!.stopCondition(evolutionStopConditions), 13

∗Topic packagergp-package, 3

%->% (sTypeConstructors), 64%::% (sTypeTags), 65&.stopCondition

(evolutionStopConditions), 13

AllExpressionNodes(expressionTransformation), 23

AnyExpressionNode(expressionTransformation), 23

arithmeticFunctionSet(defaultGPFunctionAndConstantSets),10

arity, 4arity.primitive, 4

breed (breeding), 5breeding, 5buildingBlock (buildingBlocks), 6buildingBlockq (buildingBlocks), 6buildingBlocks, 6buildingBlockTag (buildingBlockTags), 7buildingBlockTag<- (buildingBlockTags),

7buildingBlockTags, 7

c, 59c.constantFactorySet

(searchSpaceDefinition), 58c.functionSet (searchSpaceDefinition),

58c.inputVariableSet

(searchSpaceDefinition), 58calculateSTypeRecursive

(sTypeInference), 65commonSubexpressions

(expressionSimilarityMeasures),22

constantFactorySet(searchSpaceDefinition), 58

contains (lispLists), 33crossover (expressionCrossover), 18crossoverexpr (expressionCrossover), 18crossoverexprTyped

(expressionCrossover), 18crossoverTyped (expressionCrossover), 18customDist, 7

data.frame, 8, 39, 49, 68dataDrivenGeneticProgramming, 8defaultGPFunctionAndConstantSets, 10differingSubexpressions

(expressionSimilarityMeasures),22

dist, 8do.call.ignore.unused.arguments, 11

embedDataFrame, 11evaluationsPerSecondBenchmark

(rgpTestingAndBenchmarking), 56evolutionRestartConditions, 12evolutionRestartStrategies, 13evolutionStopConditions, 13expLogFunctionSet

(defaultGPFunctionAndConstantSets),10

exprChildrenOrEmptyList, 14exprCount

(expressionComplexityMeasures),15

exprDepth(expressionComplexityMeasures),15

expressionComplexityMeasures, 15expressionComposing, 16expressionCounts, 17expressionCrossover, 18expressionMutation, 19

72

Page 73: Package ‘rgp’ · rgp-package The RGP package Description RGP is a simple yet flexible Genetic Programming system for the R environment. The system implements classical untyped

INDEX 73

expressionNames, 21expressionSimilarityMeasures, 22expressionTransformation, 23exprLabel, 24exprLeaves

(expressionComplexityMeasures),15

exprShapesOfDepth (expressionCounts), 17exprShapesOfMaxDepth

(expressionCounts), 17exprShapesOfMaxSize (expressionCounts),

17exprShapesOfSize (expressionCounts), 17exprSize

(expressionComplexityMeasures),15

exprsOfDepth (expressionCounts), 17exprsOfMaxDepth (expressionCounts), 17exprsOfMaxSize (expressionCounts), 17exprsOfSize (expressionCounts), 17exprToGraph

(gpIndividualVisualization), 29exprToIgraph

(gpIndividualVisualization), 29exprToPlotmathExpr, 25exprVisitationLength

(expressionComplexityMeasures),15

exprVisitationLengthRecursive(expressionComplexityMeasures),15

extractAttributes, 25extractLeafSymbols (expressionNames), 21

fifth (lispLists), 33first (lispLists), 33flatten (listSplittingAndGrouping), 34FlattenExpression

(expressionTransformation), 23formals, 41formatSeconds, 26formula, 8–10, 39, 41, 68, 69fourth (lispLists), 33funcCount

(expressionComplexityMeasures),15

funcDepth(expressionComplexityMeasures),15

funcLeaves(expressionComplexityMeasures),15

funcSize(expressionComplexityMeasures),15

functionSet (searchSpaceDefinition), 58funcToIgraph, 27funcToIgraph

(gpIndividualVisualization), 29funcToPlotmathExpr, 26, 30funcVisitationLength

(expressionComplexityMeasures),15

geneticProgramming, 10, 13, 18, 20, 27, 39,41, 48, 54–56, 67, 70

getPw (searchSpaceDefinition), 58getSTypeFromFormalsStack

(sTypeInference), 65gpIndividualVisualization, 29groupListConsecutive, 38, 40groupListConsecutive

(listSplittingAndGrouping), 34groupListDistributed

(listSplittingAndGrouping), 34

hasBuildingBlockTag(buildingBlockTags), 7

hasPw (searchSpaceDefinition), 58hasStype (sTypeTags), 65hclust, 47

identical, 64ifPositive (safeGPfunctions), 57ifThenElse (safeGPfunctions), 57individualAnalysis, 30inputVariableSet

(searchSpaceDefinition), 58inputVariablesOfIndividual

(individualAnalysis), 30insertionSort (sortingAlgorithms), 63intersperse (listSplittingAndGrouping),

34inversePermutation, 31is.atom (lispLists), 33is.composite (lispLists), 33is.empty (lispLists), 33is.sType, 31

Page 74: Package ‘rgp’ · rgp-package The RGP package Description RGP is a simple yet flexible Genetic Programming system for the R environment. The system implements classical untyped

74 INDEX

iterate, 32

joinElites, 32

legend, 45lispLists, 33listSplittingAndGrouping, 34ln (safeGPfunctions), 57

makeComplexityTournamentSelection(selectionFunctions), 59

makeEmptyRestartCondition, 9, 28, 69makeEmptyRestartCondition

(evolutionRestartConditions),12

makeEvaluationsStopCondition(evolutionStopConditions), 13

makeFitnessDistributionRestartCondition(evolutionRestartConditions),12

makeFitnessStagnationRestartCondition,38, 40

makeFitnessStagnationRestartCondition(evolutionRestartConditions),12

makeFitnessStopCondition(evolutionStopConditions), 13

makeFunctionFitnessFunction, 35makeHierarchicalClusterFunction

(populationClustering), 47makeLocalRestartStrategy, 9, 28, 38, 40,

69makeLocalRestartStrategy

(evolutionRestartStrategies),13

makeMultiObjectiveTournamentSelection(selectionFunctions), 59

makePopulation (populationCreation), 47makeRegressionFitnessFunction, 36makeStepsStopCondition, 9, 28, 37, 39, 40,

68makeStepsStopCondition

(evolutionStopConditions), 13makeTimeStopCondition

(evolutionStopConditions), 13makeTournamentSelection, 9, 28, 38, 40, 69makeTournamentSelection

(selectionFunctions), 59

makeTypedPopulation(populationCreation), 47

MapExpressionLeafs(expressionTransformation), 23

MapExpressionNodes(expressionTransformation), 23

MapExpressionSubtrees(expressionTransformation), 23

mapply, 70mathFunctionSet

(defaultGPFunctionAndConstantSets),10

matplot, 45mse, 36multiNicheGeneticProgramming, 37, 47multiNicheSymbolicRegression, 39, 47mutateChangeDeleteInsert

(expressionMutation), 19mutateChangeLabel (expressionMutation),

19mutateDeleteInsert

(expressionMutation), 19mutateDeleteSubtree

(expressionMutation), 19mutateFunc (expressionMutation), 19mutateFuncTyped (expressionMutation), 19mutateInsertSubtree

(expressionMutation), 19mutateNumericConst

(expressionMutation), 19mutateNumericConstTyped

(expressionMutation), 19mutateSubtree (expressionMutation), 19mutateSubtreeTyped

(expressionMutation), 19

NCSdist (expressionSimilarityMeasures),22

new.alist, 41new.function, 42nondeterministicRanking, 42normalize, 43normalizedNumberOfCommonSubexpressions

(expressionSimilarityMeasures),22

normalizedSizeWeightedNumberOfCommonSubexpressions(expressionSimilarityMeasures),22

Page 75: Package ‘rgp’ · rgp-package The RGP package Description RGP is a simple yet flexible Genetic Programming system for the R environment. The system implements classical untyped

INDEX 75

normInducedFunctionDistance(expressionSimilarityMeasures),22

normInducedTreeDistance(expressionSimilarityMeasures),22

numberOfCommonSubexpressions(expressionSimilarityMeasures),22

numberOfDifferingSubexpressions(expressionSimilarityMeasures),22

numericConstantSet(defaultGPFunctionAndConstantSets),10

order, 31orderByParetoCrowdingDistance

(paretoSorting), 44orderByParetoHypervolumeContribution

(paretoSorting), 44orderByParetoMeasure, 43

par, 45paretoSorting, 44plot, 46plotFunctions, 44plotmath, 25–27plotPopulationFitnessComplexity, 45popfitness, 46populationClustering, 47populationCreation, 47positive (safeGPfunctions), 57predict.symbolicRegressionModel, 41, 49,

70print, 48print.population (populationCreation),

47print.sType, 50pw (searchSpaceDefinition), 58

randchild (randomExpressionChilds), 51randelt, 50randexprFull

(randomExpressionCreation), 51randexprGrow

(randomExpressionCreation), 51randexprTypedFull

(randomExpressionCreationTyped),52

randexprTypedGrow(randomExpressionCreationTyped),52

randfunc (randomFunctionCreation), 53randfuncRampedHalfAndHalf

(randomFunctionCreation), 53randfuncTyped

(randomFunctionCreationTyped),54

randfuncTypedRampedHalfAndHalf(randomFunctionCreationTyped),54

randomExpressionChilds, 51randomExpressionCreation, 51randomExpressionCreationTyped, 52randomFunctionCreation, 53randomFunctionCreationTyped, 54randsubtree (randomExpressionChilds), 51randterminalTyped, 55rangeTypeOfType, 55rank, 31rest (lispLists), 33rgp-package, 3rgpBenchmark

(rgpTestingAndBenchmarking), 56rgpTestingAndBenchmarking, 56rmse, 57

safeDivide (safeGPfunctions), 57safeGPfunctions, 57safeLn (safeGPfunctions), 57safeSqroot (safeGPfunctions), 57searchSpaceDefinition, 58second (lispLists), 33selectionFunctions, 59setSTypeOnFormalsStack

(sTypeInference), 65sizeWeightedNumberOfCommonSubexpressions

(expressionSimilarityMeasures),22

sizeWeightedNumberOfDifferingSubexpressions(expressionSimilarityMeasures),22

smse, 61SNCSdist

(expressionSimilarityMeasures),22

sObject (sTypeConstructors), 64sortBy, 61

Page 76: Package ‘rgp’ · rgp-package The RGP package Description RGP is a simple yet flexible Genetic Programming system for the R environment. The system implements classical untyped

76 INDEX

sortByRange, 62sortByRanking, 62sortByType, 63sortingAlgorithms, 63splitList (listSplittingAndGrouping), 34st (sTypeConstructors), 64sType (sTypeTags), 65sType<- (sTypeTags), 65sTypeConstructors, 64sTypeInference, 65sTypeTags, 65subDataFrame, 66subexpressions (expressionComposing), 16summary, 48summary.geneticProgrammingResult, 29,

39, 67summary.population

(populationCreation), 47symbolicRegression, 10, 29, 39, 49, 67, 68

tabulateFunction, 70third (lispLists), 33toName (expressionNames), 21trigonometricFunctionSet

(defaultGPFunctionAndConstantSets),10

trivialMetric(expressionSimilarityMeasures),22

typedGeneticProgramming(geneticProgramming), 27

withAttributesOf, 71