Since all derived classes implement the same member functions,
the operations of the member functions are described here.
Refer to the documentation of the base class for a description of
each member function.
The building block for a strand is defined by a local class
named Locus. A locus object will have two mutation values.
The StrandIter section shows a piece of code that uses a strand
iterator to compute the product of the fitness values of all mutated
loci in a strand sp.
The "combine" operation, which in the usual genome classes
creates a child genotype by combining genes from two parents,
in this case simply draws a random fitness. The distribution
of fitness values is specified by a random number generator
passed as a parameter to the set_parameters() function.
The application should not call run() or step() after
the population is extinct, since the behavior of these functions
is undefined in these situations. Extinction is signified by a
population size of zero, either the return value from a call to
step() or the population size indicated in the ResultBlock
returned by run().
The last parameter to each of the stand-alone functions and
each of the random number generator constructors is an optional
pointer to a seed. If the parameter is missing, or if it is
NULL, a global seed, automatically initialized from the system
clock, is used. The same seed is shared by all stand-alone
functions and all random number generator objects.
Users can create their own seeds and pass
pointers to them if they want to use their seeds instead of
the predefined seed.
Permission to use, copy, and distribute this software in
its entirety for non-commercial purposes and without fee,
is hereby granted, provided that the above copyright notice
and this permission notice appear in all copies and their
documentation.
Software developers, consultants, or anyone else who wishes
to use all or part of the software or its documentation for
commercial purposes should contact the Technology Transfer
Office at the University of Oregon to arrange a commercial
license agreement.
This software is provided "as is" without expressed or
implied warranty of any kind.
Last update: 20 Nov 97 13:32:39
Generation Classes
A generation is a collection of individuals. In the current
version, operations on generations are simply wrappers for
calls to the same operations as implemented in the
ISet class.
Generation
class Generation {
public:
private:
...
};
GenerationParamBlock
Since there are no operating parameters for the Generation class,
the current version of the parameter block is empty.
class GenerationParamBlock {
};
Genome Classes
A genome object holds all the information related to the genetic
fitness of an individual. The class named Genome is
an abstract class that defines the basic operations that must
be implemented for any genome object. An actual genotype object
will be an instance of one of the three derived classes.
See the library documentation for descriptions of the three
derived classes.
Genome
The string "=0" at the end of each member function defined
in this class signifies that the class is an abstract class. In other
words, there will never be an object of type Genome. Any
objects that represent genes will be defined by a class
that is derived from Genome, and
every such derived class must implement all of these member functions.
The functions in every derived class carry out the same steps.
The only difference is in how they
are implemented: each derived class uses different data
structures to represent a genotype, and the functions defined
for each class must update the corresponding type of data
structure.
class Genome {
protected:
public:
};
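The "=0" convention can be illustrated with a minimal sketch. The class names GenomeSketch, DenseSketch and SparseSketch, the functions fitness(), mutate() and fitness_of(), and the multiplicative fitness model are all hypothetical. They are not part of the library, whose member functions are not listed here. The sketch only shows two derived classes producing the same result from different data structures.

```cpp
#include <cassert>
#include <map>
#include <vector>

// Hypothetical stand-in for an abstract genome class (not the library's Genome).
class GenomeSketch {
public:
  virtual ~GenomeSketch() {}
  // "= 0" makes this a pure virtual function, so GenomeSketch is abstract.
  virtual double fitness() const = 0;
};

// Stores one mutation effect per locus, including the zeros.
class DenseSketch : public GenomeSketch {
public:
  explicit DenseSketch(const std::vector<double>& effects) : effects_(effects) {}
  double fitness() const override {
    double w = 1.0;
    for (double e : effects_) w *= 1.0 - e;  // assumed multiplicative model
    return w;
  }
private:
  std::vector<double> effects_;
};

// Stores only the mutated loci, keyed by locus index.
class SparseSketch : public GenomeSketch {
public:
  void mutate(int locus, double effect) {
    assert(locus >= 0);
    effects_[locus] = effect;
  }
  double fitness() const override {
    double w = 1.0;
    for (const auto& entry : effects_) w *= 1.0 - entry.second;
    return w;
  }
private:
  std::map<int, double> effects_;
};

// Callers see only the base-class interface.
double fitness_of(const GenomeSketch& g) { return g.fitness(); }
```

Declaring a variable of type GenomeSketch is a compile-time error, which is the property the "=0" notation is meant to guarantee.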
GenomeParamBlock
The parameter block defined in the base class has fields that are
common to all genome classes. The derived classes may also
define additional parameters by defining their own parameter
blocks that are derived from a GenomeParamBlock.
class GenomeParamBlock {
public:
mutation_t s; /* s = mean mutation effect */
double u; /* mu = gametic mutation rate */
};
InfiniteGenome
An "infinite genome" allows genes to grow arbitrarily long.
When a locus no longer differentiates any individual, it can be reused, i.e.
it can be reset to 0.0 in all individuals and used as a location
for a new mutation. An internal garbage collection routine,
invisible to user code, reclaims such loci periodically.
class InfiniteGenome : public Genome {
public:
protected:
...
};
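The reuse of loci can be sketched as follows. This is not the library's garbage collector, which is internal and not documented here. The sketch assumes that "no longer differentiates any individual" means every individual carries the same value at that locus, and it uses a dense table purely for illustration. The names LocusTable and collect_free_loci are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical layout: table[i][k] is the mutation effect of individual i at locus k.
typedef std::vector<std::vector<double> > LocusTable;

// Reset every locus whose value is identical in all individuals and return
// the indices of the loci that are now zero everywhere and free for reuse.
std::vector<std::size_t> collect_free_loci(LocusTable& table) {
  std::vector<std::size_t> free_loci;
  if (table.empty()) return free_loci;
  const std::size_t nloci = table[0].size();
  for (std::size_t i = 0; i < table.size(); ++i) assert(table[i].size() == nloci);

  for (std::size_t locus = 0; locus < nloci; ++locus) {
    bool same_everywhere = true;
    for (std::size_t i = 1; i < table.size() && same_everywhere; ++i)
      same_everywhere = (table[i][locus] == table[0][locus]);
    if (same_everywhere) {
      for (std::size_t i = 0; i < table.size(); ++i) table[i][locus] = 0.0;
      free_loci.push_back(locus);
    }
  }
  return free_loci;
}
```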
InfGenomeParamBlock
class InfGenomeParamBlock : public GenomeParamBlock {
public:
double d; /* dominance factor */
};
SparseGenome
A "sparse genome" is basically a sparse vector. The genome
length is specified in advance, and cannot change during the
simulation. The genome length can be quite large, but the representation
is efficient because only loci that contain mutations are
actually stored in memory.
Thus the amount of space occupied by a genotype is
proportional to the number of mutations. The data structure used
to hold the sparse vector is known as a strand.
A genotype will consist of one or more strands, and the combine
function will combine the strands independently, implementing
a chromosome structure for the genes.
class SparseGenome : public Genome {
public:
protected:
...
};
SparseGenomeParamBlock
The additional parameters needed for a sparse genome define the
variation in mutation effects (the mean mutation effect is
specified in the base class parameter s) and
parameters that define the overall genome length, the
number of chromosomes, and the probability of a cross-over
during recombination.
class SparseGenomeParamBlock : public GenomeParamBlock {
public:
mutation_t sds; /* std deviation of s (s = mutation effect) */
int gl; /* G = number of loci in genome */
int nchromosomes; /* N = number of haploid chromosomes */
double maplength; /* M = genetic map length (unit = Morgans) */
};
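How two sparse strands might be combined at a set of cross-over points can be sketched as below. The type LociMap and the function recombine() are hypothetical, and the cross-over positions are passed in explicitly. How the library derives them from maplength is not described here, so the sketch makes no assumption about it.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <map>
#include <vector>

// Hypothetical sparse strand: locus index -> mutation effect, zero loci omitted.
typedef std::map<int, double> LociMap;

// True if an even number of cross-over points lie at or below the locus,
// i.e. the child is still copying from the first parent at that locus.
static bool from_first_parent(const std::vector<int>& crossovers, int locus) {
  std::size_t passed =
      std::upper_bound(crossovers.begin(), crossovers.end(), locus) - crossovers.begin();
  return passed % 2 == 0;
}

// Build a child strand that copies parent a up to the first cross-over point,
// parent b up to the second, and so on. crossovers must be sorted.
LociMap recombine(const LociMap& a, const LociMap& b, const std::vector<int>& crossovers) {
  assert(std::is_sorted(crossovers.begin(), crossovers.end()));
  LociMap child;
  for (const auto& locus : a)
    if (from_first_parent(crossovers, locus.first)) child.insert(locus);
  for (const auto& locus : b)
    if (!from_first_parent(crossovers, locus.first)) child.insert(locus);
  return child;
}
```

Only mutated loci are visited, so the cost of the operation grows with the number of mutations rather than with the genome length.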
Strand
A strand is used to hold the mutations in a single chromosome.
Conceptually a strand is simply a linked list that chains
together loci. The current implementation uses segmented
vectors to provide better locality of reference and faster
access.
class Strand {
friend class StrandIter;
public:
private:
...
};
StrandIter
A StrandIter object is an "iterator" object for strands.
Use it to traverse a strand
in order, stepping from the first non-zero locus to the last.
The iterator knows how the strand is put together, so it can
move efficiently from one locus to the next and skip over
zero loci.
for (StrandIter si(sp); si < max; ++si)
w *= sp.get_fitness(si);
The name of the iterator is si. Note that the
argument of the iterator constructor is a reference to the strand object
that the iterator will traverse.
When the iterator is created, its
value is the index of the first non-zero locus in the strand. The operation
++si will set si to the index of the next non-zero
locus in the strand. max is the length of the strand,
i.e. a value higher than any locus index.
class StrandIter {
public:
private:
...
};
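The behavior described above can be imitated with a small self-contained sketch. StrandSketch and StrandIterSketch are hypothetical classes written for this illustration, and they use a sorted vector rather than the library's segmented vectors. Unlike the library example, the fitness value is read from the iterator itself through a hypothetical fitness() member.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical strand: (locus index, fitness) pairs for mutated loci only,
// kept sorted by locus index.
class StrandSketch {
public:
  explicit StrandSketch(int length) : length_(length) {}
  int length() const { return length_; }
  // Append a mutated locus; indices must be added in increasing order.
  void add(int locus, double fitness) {
    assert(locus >= 0 && locus < length_);
    assert(loci_.empty() || loci_.back().first < locus);
    loci_.push_back(std::make_pair(locus, fitness));
  }
private:
  friend class StrandIterSketch;
  int length_;
  std::vector<std::pair<int, double> > loci_;
};

// Iterator whose value is the index of the current mutated locus, or the
// strand length once it has moved past the last one.
class StrandIterSketch {
public:
  explicit StrandIterSketch(const StrandSketch& s) : s_(s), pos_(0) {}
  operator int() const {
    return pos_ < s_.loci_.size() ? s_.loci_[pos_].first : s_.length_;
  }
  StrandIterSketch& operator++() {
    if (pos_ < s_.loci_.size()) ++pos_;
    return *this;
  }
  double fitness() const {
    assert(pos_ < s_.loci_.size());
    return s_.loci_[pos_].second;
  }
private:
  const StrandSketch& s_;
  std::size_t pos_;
};

// Product of the fitness values of all mutated loci in sp.
double strand_fitness(const StrandSketch& sp) {
  double w = 1.0;
  const int max = sp.length();
  for (StrandIterSketch si(sp); si < max; ++si) w *= si.fitness();
  return w;
}
```

Because the iterator converts to the index of its current locus, the loop condition si < max has the same form as in the library example.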
VirtualGenome
A "virtual" genome is simply a single fitness value that represents
the total overall fitness of the genotype. Individual loci are
not represented explicitly.
class VirtualGenome : public Genome {
public:
protected:
...
};
VirtualGenomeParamBlock
The only operating parameter for the virtual genome class is
the random number generator object to use for drawing the fitness
of new genotypes.
class VirtualGenomeParamBlock : public GenomeParamBlock {
public:
RNG *R;
};
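A minimal sketch of a genome that is nothing more than a fitness value follows. VirtualSketch and its combine() signature are hypothetical, and the standard header <random> stands in for the library's RNG classes. The point is only that "combine" ignores the parents and draws the child's fitness from the supplied distribution.

```cpp
#include <cassert>
#include <random>

// Hypothetical virtual genome: a single fitness value, no explicit loci.
class VirtualSketch {
public:
  explicit VirtualSketch(double w = 1.0) : w_(w) { assert(w >= 0.0); }
  double fitness() const { return w_; }

  // "Combine" does not inspect the parents; it draws the child's fitness
  // from the distribution supplied by the caller.
  template <class Dist, class Engine>
  static VirtualSketch combine(const VirtualSketch&, const VirtualSketch&,
                               Dist& dist, Engine& engine) {
    return VirtualSketch(dist(engine));
  }
private:
  double w_;
};
```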
Fitness Functions
Individual Classes
An individual consists of a genotype, a sex, and an ID. Although
this simple representation is unlikely to suffice for any but the
simplest simulations, this class can be used as a base class for
deriving more complex individuals.
Individual
class Individual {
public:
protected:
...
};
IndividualParamBlock
Since there are no operating parameters for the Individual class,
the current version of the parameter block is empty.
class IndividualParamBlock {
};
ISet
An ISet is a container object for the Set class. It implements
a simple ordered set, allowing set elements to be referenced by
index. If the set currently holds n items, the items
are indexed from 0 to n-1.
class ISet {
public:
private:
...
};
Population Classes
Populations are collections of generations. The simple population
defined in the library has just two generations -- a "current"
generation of reproducing adults and a "new" generation of offspring
produced by the current generation. As is the case with Individual
and Generation, the Population class is too simple to be used in
any but the simplest simulations, but it can be used as a base class
for more complicated populations.
Population
class Population {
public:
protected:
...
};
The run() and step() procedures can be used interchangeably.
After they return, the population is left in a state where the simulation
can be resumed by a subsequent call. For example, an application might
define a derived type that has member functions for visualizing the state
of the population. A program might call step() two times so the
user can view two generations, then call run(100) to advance the
simulation to year 100; in this case a call to run(100) simulates
98 more years, and is equivalent to calling step() 98 more
times, assuming the population is not extinct before year 100.
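The relationship between step() and run() can be sketched with a toy population. PopulationSketch and ResultSketch are hypothetical. The fixed loss per generation is an arbitrary stand-in for the real dynamics, and the sketch assumes that ngen counts generations since the start of the simulation rather than generations simulated by the latest call.

```cpp
#include <cassert>

// Hypothetical result record, modeled on the two ResultBlock fields.
struct ResultSketch {
  int ngen;  // generations simulated so far
  int nsur;  // current population size
};

// Toy population that loses a fixed number of individuals per generation.
class PopulationSketch {
public:
  PopulationSketch(int size, int loss) : gen_(0), size_(size), loss_(loss) {}

  // Advance one generation and return the new population size.
  int step() {
    assert(size_ > 0);  // must not be called once the population is extinct
    ++gen_;
    size_ = size_ > loss_ ? size_ - loss_ : 0;
    return size_;
  }

  // Advance until generation n is reached or the population is extinct.
  ResultSketch run(int n) {
    assert(size_ > 0);
    while (gen_ < n && size_ > 0) step();
    ResultSketch r = {gen_, size_};
    return r;
  }
private:
  int gen_, size_, loss_;
};
```

In this sketch, two calls to step() followed by run(100) leave the population in the same state as one hundred calls to step(), and a zero in nsur tells the caller to stop.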
ResultBlock
class ResultBlock {
public:
int ngen; /* number of generations simulated */
int nsur; /* current size of the population */
};
PopulationParamBlock
class PopulationParamBlock {
public:
double kmax; /* K = mean carrying capacity */
double sdk; /* std deviation of kmax */
double rmax; /* R = mean reproductive rate */
double sdr; /* std deviation of rmax */
double u; /* mu = gametic mutation rate */
};
Random Number Generator Classes
The GSL library includes several random number generators. There
are two ways to draw a sample from a random distribution: call
a stand-alone function for that distribution, or create a
random number generator object and use an operator to advance
the object to the next number in its sequence.
RNGSeed
The random number generators are built on top of the rand48
collection of random number functions that are part of most Unix
system libraries. Seeds for these functions are 48 bits long.
The RNGSeed class gives users a way to create a seed from an
integer and to examine the state of a seed.
class RNGSeed {
public:
};
RNG Base Class
The operations defined for the base class are implemented
in every random number generator object created by a derived
class.
class RNG {
public:
};
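The two calling styles and the optional seed pointer can be sketched as follows. SeedSketch, default_seed(), uniform_sketch() and UniformSketch are hypothetical names, not the library's. The sketch calls the POSIX function erand48() directly, and it uses a constant default seed instead of the system clock so that the example is reproducible.

```cpp
#include <cassert>
#include <stdlib.h>  // erand48 (POSIX)

// Hypothetical 48-bit seed: the state array that erand48() updates in place.
struct SeedSketch {
  unsigned short x[3];
  explicit SeedSketch(unsigned long v) {
    x[0] = static_cast<unsigned short>(v & 0xFFFF);
    x[1] = static_cast<unsigned short>((v >> 16) & 0xFFFF);
    x[2] = 0;
  }
};

// Shared default seed, used whenever the caller passes no seed pointer.
inline SeedSketch& default_seed() {
  static SeedSketch seed(12345UL);  // assumption: constant, not clock-based
  return seed;
}

// Style 1: stand-alone function with an optional trailing seed pointer.
double uniform_sketch(double lower, double upper, SeedSketch* s = nullptr) {
  assert(lower <= upper);
  if (s == nullptr) s = &default_seed();
  return lower + (upper - lower) * erand48(s->x);
}

// Style 2: generator object that remembers its parameters and its seed.
class UniformSketch {
public:
  UniformSketch(double lower, double upper, SeedSketch* s = nullptr)
      : lower_(lower), upper_(upper), seed_(s ? s : &default_seed()) {}
  // Advance to the next number in the sequence and return it.
  double operator++() { return uniform_sketch(lower_, upper_, seed_); }
private:
  double lower_, upper_;
  SeedSketch* seed_;
};
```

Two seeds created from the same integer produce the same sequence, which is what makes a user-supplied seed useful for repeatable runs.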
Binomial Distribution
A value from a binomial distribution will be
an integer-valued floating-point number between 0 and n
corresponding to the number of successes in n trials, where each
trial succeeds with probability p.
double rbinomial(double p, int n, RNGSeed *s = NULL);
class BinomialRNG : public RNG {
public:
BinomialRNG(double p, int n, RNGSeed *s = NULL);
double operator ++();
private:
...
};
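The definition above can be turned into a direct, if slow, sketch that simply performs the n trials. The function binomial_sketch() is hypothetical and uses the standard header <random> in place of the library's generators. The library may well use a faster method.

```cpp
#include <cassert>
#include <random>

// Count successes in n independent trials, each succeeding with probability p.
// The count is returned as a double, matching the documented return type.
template <class Engine>
double binomial_sketch(double p, int n, Engine& engine) {
  assert(p >= 0.0 && p <= 1.0 && n >= 0);
  std::uniform_real_distribution<double> unit(0.0, 1.0);
  int successes = 0;
  for (int trial = 0; trial < n; ++trial)
    if (unit(engine) < p) ++successes;
  return static_cast<double>(successes);
}
```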
Distributions Based on Cumulative Distribution Functions
A value from a CDF distribution will have a probability that is
defined by a probability density function (PDF) supplied by the user.
The PDF is contained in a file that is read by the class
constructor and used to create an internal cumulative distribution
function (CDF), which is in turn used whenever the user requests
a new value. See "Arbitrary Distributions"
in the GSL User Manual for information on the file format
and the types of distributions supported.
typedef enum {CDF_OK, CDF_OPEN_ERR, CDF_FORMAT_ERR} CDFStatus;
class CDFRNG : public RNG {
public:
private:
...
};
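The step from a user-supplied PDF to a cumulative table, and from the table to a sample, can be sketched as follows. CdfSketch is hypothetical. It takes a discrete table in memory rather than a file, because the file format is described only in the User Manual, and it expects the caller to supply the uniform number u.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical table-driven sampler: values[i] occurs with weight pdf[i].
class CdfSketch {
public:
  CdfSketch(const std::vector<double>& values, const std::vector<double>& pdf)
      : values_(values), cdf_(pdf.size()) {
    assert(!pdf.empty() && values.size() == pdf.size());
    double total = 0.0;
    for (std::size_t i = 0; i < pdf.size(); ++i) {
      assert(pdf[i] >= 0.0);
      total += pdf[i];
      cdf_[i] = total;  // running sum = unnormalized cumulative table
    }
    assert(total > 0.0);
    for (std::size_t i = 0; i < cdf_.size(); ++i) cdf_[i] /= total;
  }

  // Map a uniform number u in [0, 1) to the first value whose cumulative
  // probability exceeds u (inverse transform sampling).
  double sample(double u) const {
    assert(u >= 0.0 && u < 1.0);
    std::size_t i = std::upper_bound(cdf_.begin(), cdf_.end(), u) - cdf_.begin();
    if (i >= values_.size()) i = values_.size() - 1;  // guard against rounding
    return values_[i];
  }
private:
  std::vector<double> values_;
  std::vector<double> cdf_;
};
```

With weights 1, 2, 1 the cumulative table is 0.25, 0.75, 1.0, so half of the unit interval maps to the middle value.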
Exponential Distribution
A value from an exponential distribution will be
a positive real value with exponentially decreasing probability
of higher values.
double rexponential(RNGSeed *s = NULL);
class ExponentialRNG : public RNG {
public:
ExponentialRNG(RNGSeed *s = NULL);
double operator ++();
};
Gamma Distribution
This class implements a gamma distribution with mean a and standard deviation
b. Note: when a == b the distribution is an exponential
distribution; when a < b the distribution is more L-shaped;
and when a > b the distribution is similar to a log-normal distribution.
double rgamma(double a, double b, RNGSeed *s = NULL);
class GammaRNG : public RNG {
public:
GammaRNG(double a, double b, RNGSeed *s = NULL);
double operator ++();
private:
...
};
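The three cases in the note follow from rewriting the mean and standard deviation in the more common shape and scale form. The sketch below does that conversion and then draws from std::gamma_distribution. The names GammaShapeScale, gamma_params() and gamma_sketch() are hypothetical, and the library's own sampling method is not documented here.

```cpp
#include <cassert>
#include <random>

struct GammaShapeScale {
  double shape;
  double scale;
};

// Convert (mean a, standard deviation b) to shape/scale form, using
// mean = shape * scale and variance = shape * scale^2.
GammaShapeScale gamma_params(double a, double b) {
  assert(a > 0.0 && b > 0.0);
  GammaShapeScale p;
  p.shape = (a / b) * (a / b);  // 1 when a == b (exponential), < 1 when a < b
  p.scale = (b * b) / a;
  return p;
}

// Draw one value with the requested mean and standard deviation.
template <class Engine>
double gamma_sketch(double a, double b, Engine& engine) {
  GammaShapeScale p = gamma_params(a, b);
  std::gamma_distribution<double> dist(p.shape, p.scale);
  return dist(engine);
}
```

A shape of exactly 1 is the exponential case, a shape below 1 gives the L-shaped density, and a shape above 1 gives a humped density.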
LogNormal Distribution
There are two ways to specify a log-normal distribution. The first
("lognormal") is used when the mean and standard deviation are
specified on the normal scale, and the second ("lognormallog")
is used when the mean and standard deviation are specified on
the log scale.
double rlognormal(double mean, double stddev, RNGSeed *s = NULL);
double rlognormallog(double mean, double stddev, RNGSeed *s = NULL);
class LogNormalRNG : public RNG {
public:
LogNormalRNG(double mean, double stddev, RNGSeed *s = NULL);
~LogNormalRNG();
double operator ++();
private:
...
};
class LogNormalLogRNG : public RNG {
public:
LogNormalLogRNG(double mean, double stddev, RNGSeed *s = NULL);
~LogNormalLogRNG();
double operator ++();
private:
...
};
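The relationship between the two parameterizations can be sketched as a conversion. The sketch assumes that "normal scale" refers to the mean and standard deviation of the log-normal variable X itself, and "log scale" to those of ln(X). If the library intends something else, the formulas do not apply. The names LogScale, to_log_scale() and mean_from_log_scale() are hypothetical.

```cpp
#include <cassert>
#include <cmath>

struct LogScale {
  double mu;     // mean of ln(X)
  double sigma;  // standard deviation of ln(X)
};

// Given the mean m and standard deviation s of a log-normal variable X,
// return the mean and standard deviation of ln(X).
LogScale to_log_scale(double m, double s) {
  assert(m > 0.0 && s >= 0.0);
  LogScale p;
  double var = std::log(1.0 + (s * s) / (m * m));
  p.sigma = std::sqrt(var);
  p.mu = std::log(m) - 0.5 * var;
  return p;
}

// Inverse direction for the mean: E[X] = exp(mu + sigma^2 / 2).
double mean_from_log_scale(const LogScale& p) {
  return std::exp(p.mu + 0.5 * p.sigma * p.sigma);
}
```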
Normal Distribution
A value from this distribution will be normally
distributed with mean mean
and standard deviation stddev.
double rnormal(double mean, double stddev, RNGSeed *s = NULL);
class NormalRNG : public RNG {
public:
NormalRNG(double mean, double stddev, RNGSeed *s = NULL);
double operator ++();
private:
...
};
Poisson Distribution
A value from a Poisson distribution is an integer-valued floating-point number
with expected value lambda and exponentially decreasing probability
of higher values.
double rpoisson(double lambda, RNGSeed *s = NULL);
class PoissonRNG : public RNG {
public:
PoissonRNG(double lambda, RNGSeed *s = NULL);
double operator ++();
private:
...
};
Uniform Distribution
This distribution consists of real numbers evenly distributed
between lower and upper.
double runiform(double lower, double upper, RNGSeed *s = NULL);
class UniformRNG : public RNG {
public:
UniformRNG(double lower, double upper, RNGSeed *s = NULL);
double operator ++();
private:
...
};
Random Words
Statistics Classes
The Statistics class is used to compute descriptive statistics
for a set of data points.
Statistics
A Statistics object works like the stat functions in a hand-held
calculator. To compute statistics for a set of data points,
create a Statistics object, and then record the values one at a time
in the object. After the last value has been recorded, call
get_results(); the mean, standard deviation, etc. will
be returned in a StatsBlock object.
class Statistics {
public:
protected:
...
};
StatsBlock
A StatsBlock is simply a record structure with fields for
each statistic computed by a Statistics object.
class StatsBlock {
public:
int n; /* number of observations */
double mean; /* mean */
double sd; /* standard deviation */
double cv; /* coefficient of variation */
int min; /* minimum value */
int max; /* maximum value */
};
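The calculator-style workflow can be sketched with a running accumulator. RunningStats, StatsSketch and record() are hypothetical names. Only get_results() is taken from the description above, and its real signature is not shown here. The sketch assumes the sample standard deviation (divisor n-1), and it stores min and max as doubles.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical result record with the same kinds of fields as StatsBlock.
struct StatsSketch {
  int n;
  double mean, sd, cv, min, max;
};

// Accumulates values one at a time using Welford's running update.
class RunningStats {
public:
  RunningStats() : n_(0), mean_(0.0), m2_(0.0), min_(0.0), max_(0.0) {}

  void record(double x) {
    if (n_ == 0 || x < min_) min_ = x;
    if (n_ == 0 || x > max_) max_ = x;
    ++n_;
    double delta = x - mean_;
    mean_ += delta / n_;
    m2_ += delta * (x - mean_);  // running sum of squared deviations
  }

  StatsSketch get_results() const {
    assert(n_ > 0);
    StatsSketch r;
    r.n = n_;
    r.mean = mean_;
    r.sd = n_ > 1 ? std::sqrt(m2_ / (n_ - 1)) : 0.0;  // sample std deviation
    r.cv = mean_ != 0.0 ? r.sd / mean_ : 0.0;
    r.min = min_;
    r.max = max_;
    return r;
  }
private:
  int n_;
  double mean_, m2_, min_, max_;
};
```

The running update avoids storing the data points, which matches the record-then-report usage described for the Statistics class.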
Copyright © 1997 by the University of Oregon.
ALL RIGHTS RESERVED.