Introduction to Stats
EpochX provides what we call the Stats system to gather data about how a run
is progressing and to give convenient access to that data, including
the ability to output it. The center of the Stats system in EpochX is the
Stats class. Like the Life class, this class implements the
singleton pattern, so there is only ever one Stats instance, which is
obtained with a call to the static method Stats.get().
Raw data generated by the evolutionary algorithm, such as the population at the
end of a generation, is inserted into the Stats instance. Other
components can then get a copy of the data with a call to one of the class's
getStat methods. Each of these takes an instance of Stat, which
is used to look up whether any data has been stored against that same Stat
instance. For this to work, the request for the data must use the same
Stat instance that was used to put the data in, so the Stat instances are made
available as public fields throughout the API.
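To make the "same instance" requirement concrete, here is a minimal, self-contained sketch of a singleton store whose keys are matched by object identity. StatKey, StatStore and StatStoreDemo are hypothetical stand-ins written for this illustration, not EpochX classes, and the use of an IdentityHashMap is an assumption about how such a lookup could be implemented.

```java
import java.util.IdentityHashMap;
import java.util.Map;

// Hypothetical stand-ins for illustration only; these are NOT the EpochX classes.
final class StatKey {
    private final String label;

    StatKey(String label) {
        this.label = label;
    }

    @Override
    public String toString() {
        return label;
    }
}

final class StatStore {
    // Single shared instance, mirroring the singleton access described above.
    private static final StatStore INSTANCE = new StatStore();

    // Keys are compared by reference, so only the very same key object matches.
    private final Map<StatKey, Object> data = new IdentityHashMap<>();

    private StatStore() {
    }

    static StatStore get() {
        return INSTANCE;
    }

    void addData(StatKey key, Object value) {
        data.put(key, value);
    }

    Object getStat(StatKey key) {
        return data.get(key);
    }
}

final class StatStoreDemo {
    // Shared public constant: producers and consumers both refer to this object.
    public static final StatKey GEN_NUMBER = new StatKey("generation number");

    public static void main(String[] args) {
        StatStore.get().addData(GEN_NUMBER, 12);
        // The same key object retrieves the stored value.
        System.out.println(StatStore.get().getStat(GEN_NUMBER));
        // A different key object with the same label finds nothing (null).
        System.out.println(StatStore.get().getStat(new StatKey("generation number")));
    }
}
```

Because a freshly constructed key never matches, sharing the key through a public constant is what lets independent components exchange data through the store.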
Dynamic generation
What if we want to provide access to the average fitness from a generation or the average depth of
the programs in a breeding pool? This kind of data is costly to calculate and store,
so generating all these extra statistics without knowing whether they are even needed
would carry a significant performance cost. Fortunately, we have a solution. The Stat
interface defines the method getStatValue(). If the value for
a requested Stat is not already held in memory, this method is called,
giving the Stat instance itself the opportunity to generate a value. It is
quite common for dependency chains to develop, where the generation of one Stat
requests another which must itself be generated, which in turn requires another, and so on. As long as
the dependencies contain no cycles, this works perfectly and results in all intermediate
Stat values being cached for future requests, which avoids any unnecessary
performance penalty.
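The generate-on-demand behaviour described above can be sketched in a few lines. This is a self-contained illustration under stated assumptions: LazyStat, LazyStore and LazyDemo are hypothetical names, and passing the store into the generator function is a choice made for this sketch, not the signature of the real getStatValue() method.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical stand-ins for illustration only; these are NOT the EpochX classes.
final class LazyStat {
    final String name;
    // Computes a value on demand; null for stats that only ever hold raw data.
    final Function<LazyStore, Object> generator;

    LazyStat(String name, Function<LazyStore, Object> generator) {
        this.name = name;
        this.generator = generator;
    }
}

final class LazyStore {
    private final Map<LazyStat, Object> cache = new HashMap<>();
    int generated = 0; // counts values that were generated rather than inserted

    void addData(LazyStat stat, Object value) {
        cache.put(stat, value);
    }

    Object getStat(LazyStat stat) {
        if (cache.containsKey(stat)) {
            return cache.get(stat); // already held in memory
        }
        if (stat.generator == null) {
            return null; // no raw data and no way to generate it
        }
        Object value = stat.generator.apply(this); // may request other stats
        generated++;
        cache.put(stat, value); // keep it for future requests
        return value;
    }
}

final class LazyDemo {
    static final LazyStat FITNESSES = new LazyStat("fitnesses", null);

    static final LazyStat FITNESS_SUM = new LazyStat("fitness sum", store -> {
        double sum = 0.0;
        for (double f : (double[]) store.getStat(FITNESSES)) {
            sum += f;
        }
        return Double.valueOf(sum);
    });

    // Depends on FITNESS_SUM, which in turn depends on the raw FITNESSES data.
    static final LazyStat FITNESS_AVE = new LazyStat("fitness average", store -> {
        double[] fitnesses = (double[]) store.getStat(FITNESSES);
        double sum = (Double) store.getStat(FITNESS_SUM);
        return Double.valueOf(sum / fitnesses.length);
    });

    public static void main(String[] args) {
        LazyStore store = new LazyStore();
        store.addData(FITNESSES, new double[] {1.0, 2.0, 6.0});
        System.out.println(store.getStat(FITNESS_AVE)); // generates the sum, then the average
        System.out.println(store.getStat(FITNESS_SUM)); // served from the cache
        System.out.println(store.generated);            // two values were generated in total
    }
}
```

Requesting the average triggers generation of the sum as an intermediate value, and the later request for the sum is answered from the cache without recomputation. A cycle between generators would recurse without end in this sketch, which is why the dependencies must be acyclic.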
Expiry
The other method that all Stat implementations must implement is the
getExpiryEvent() method which returns a value from the Stats.ExpiryEvent
enum. Possible values include:
    ExpiryEvent.RUN
    ExpiryEvent.GENERATION
    ExpiryEvent.INITIALISATION
    ExpiryEvent.ELITISM
    ExpiryEvent.POOL_SELECTION
    ExpiryEvent.CROSSOVER
    ExpiryEvent.MUTATION
    ExpiryEvent.REPRODUCTION
The ExpiryEvent that a Stat returns designates the event with which any values for that
Stat are associated. In practice, this means that the stats system
stores the value for that Stat only until the start of the next event of that type.
For example, all GENERATION Stat values will be cleared at the start of the following
generation. This means that there is a specific window during which the values are accessible, after which it
is assumed that any interested parties have extracted and processed the stats they need, storing them elsewhere
if necessary. The simplest and most common way of doing this is outputting the values, as described in the
following section.
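The expiry window can be illustrated with a small, self-contained sketch. The names SketchExpiry, ExpiringStat, ExpiringStore and the onEventStart method are hypothetical and exist only for this example. How EpochX clears values internally is not shown in this document, so the removal strategy below is an assumption.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-ins for illustration only; these are NOT the EpochX classes.
enum SketchExpiry { RUN, GENERATION, CROSSOVER }

final class ExpiringStat {
    final String name;
    final SketchExpiry expiry;

    ExpiringStat(String name, SketchExpiry expiry) {
        this.name = name;
        this.expiry = expiry;
    }
}

final class ExpiringStore {
    private final Map<ExpiringStat, Object> values = new HashMap<>();

    void addData(ExpiringStat stat, Object value) {
        values.put(stat, value);
    }

    Object getStat(ExpiringStat stat) {
        return values.get(stat);
    }

    // Intended to be called at the start of each event of the given type:
    // every value whose stat expires on that event is discarded.
    void onEventStart(SketchExpiry event) {
        values.keySet().removeIf(stat -> stat.expiry == event);
    }
}

final class ExpiryDemo {
    static final ExpiringStat RUN_SEED =
            new ExpiringStat("run seed", SketchExpiry.RUN);
    static final ExpiringStat GEN_BEST_FITNESS =
            new ExpiringStat("best fitness", SketchExpiry.GENERATION);

    public static void main(String[] args) {
        ExpiringStore store = new ExpiringStore();
        store.addData(RUN_SEED, 42L);
        store.addData(GEN_BEST_FITNESS, 0.25);

        store.onEventStart(SketchExpiry.GENERATION); // the next generation begins

        System.out.println(store.getStat(GEN_BEST_FITNESS)); // cleared, so null
        System.out.println(store.getStat(RUN_SEED));         // still available
    }
}
```

Only values tied to the event that has just started are dropped, so run-level data survives across generations while generation-level data must be read before the next generation begins.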
Generating output
The most common use of the Stats system is to generate output about your runs. A number of methods
are provided on the Stats class to help with this, with the following signatures.
    print(Stat ... fields)
    print(String separator, Stat ... fields)
    printToStream(OutputStream out, Stat ... fields)
    printToStream(OutputStream out, String separator, Stat ... fields)
The two print methods are conveniences that print to the System.out
OutputStream. Calling any of these methods retrieves the specified stat
values and prints them to the designated output stream, with the values separated
by the String given as the separator parameter.
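As a rough picture of what these methods do, the following self-contained sketch joins already-retrieved values with a separator and writes them to a stream. StatPrinter is a hypothetical class written for this example. It works on plain values rather than Stat instances, and ending each row with a line break is an assumption rather than documented EpochX behaviour.

```java
import java.io.ByteArrayOutputStream;
import java.io.OutputStream;
import java.io.PrintStream;
import java.util.StringJoiner;

// Hypothetical stand-in for illustration only; this is NOT the EpochX Stats class.
final class StatPrinter {

    // Writes the values as one line, separated by the given separator.
    static void printToStream(OutputStream out, String separator, Object... values) {
        StringJoiner line = new StringJoiner(separator);
        for (Object value : values) {
            line.add(String.valueOf(value)); // null values are printed as "null"
        }
        PrintStream printer = new PrintStream(out, true);
        printer.println(line.toString());
        printer.flush(); // flush but do not close: the caller owns the stream
    }

    // Convenience variant that targets the console.
    static void print(String separator, Object... values) {
        printToStream(System.out, separator, values);
    }

    public static void main(String[] args) {
        // Console output of a generation number, an average fitness and a label.
        print("\t", 3, 0.25, "best");

        // The same row captured in memory with a different separator.
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        printToStream(buffer, ", ", 3, 0.25, "best");
        System.out.print(buffer.toString());
    }
}
```

Writing to an arbitrary OutputStream is what makes it possible to send the same row to a file or a buffer instead of the console.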
Given that the life of statistics data is short, where should these print method calls be made? The right
time is almost always upon an 'end' event, such as onCrossoverEnd or onGenerationEnd.
At this point, all the data for that operation will be in the Stats manager, and it is
guaranteed that none of it will have been cleared yet. There is, however, no guarantee that stats associated
with other events that are still in progress (like the run or generation) will be available yet.
In general, though, the framework will try to put the raw data
into the Stats instance as soon as it is available, which means many stats will be usable
long before the end event. For example, it is completely safe to request the generation number
(StatField.GEN_NUMBER) during an onCrossoverEnd event.
Enough words. Here is a useful idiom you might like to use:
    Life.get().addGenerationListener(new GenerationAdapter() {
        public void onGenerationEnd() {
            Stats.get().print(...);
        }
    });
This will print statistics to the console at the end of each generation. The same idea works for printing statistics for each run, crossover, elitism and so on.
Creating a new Stat
If you are implementing a new operator or other extension, you may find that you have
new data that you would like to make available through the stats system. The easiest way to do this
is to create a public field that holds an instance of an anonymous subclass of AbstractStat.
AbstractStat implements the Stat interface and provides defaults. In the simplest
case your Stat might look like this:
public static final Stat MY_NEW_STAT = new AbstractStat(ExpiryEvent.GENERATION) {};
All you need to do then is to add the raw data into the stats manager using the addData method
on the Stats instance, making sure to use your MY_NEW_STAT as the Stat.
You can then retrieve or print this data in the same way as any other stat, by referencing it with
this Stat instance. The data will be cleared upon the next occurrence of the expiry event, in this
case at the start of the next generation.
Where it makes sense to do so, you should consider overriding
AbstractStat's getStatValue method so that the statistic is only calculated if required.
Here is a simple example:
    /**
     * Returns an Integer, which is the number of programs with an above average fitness
     * in the generation.
     */
    public static final Stat NO_ABOVE_AVE = new AbstractStat(ExpiryEvent.GENERATION) {
        @Override
        public Object getStatValue() {
            double ave = (Double) Stats.get().getStat(GEN_FITNESS_AVE);
            double[] fitnesses = (double[]) Stats.get().getStat(GEN_FITNESSES);

            int noAboveAve = 0;
            for (double fit: fitnesses) {
                if (fit > ave) {
                    noAboveAve++;
                }
            }
            return noAboveAve;
        }
    };
It obtains the average fitness and an array of all fitnesses, then counts how many of the program fitnesses are greater than the average, returning the result as the value. Notice how it depends upon two other stats.


