
Checkpoint Restart

To minimize runtime lost to a system crash, network loss, or electrical outage, DAO can restart a run from checkpoint files written after the last completed round. The data is saved in five files: a small control file containing the latest summary information, two 50K-byte files holding the current and best student occupancy data, and two smaller files for the current and best waiting lists. On operating systems that support renaming operations, we use the OLD-NEW-LST suffixes; otherwise, we use generational suffixes. The former maintains only the last two checkpoints, while the number maintained by the latter depends upon the number of rounds.
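
As an illustration only, the rename-based scheme might look like the following Python sketch. It is not DAO's code. The meaning given to the suffixes here (NEW for a file still being written, LST for the latest complete checkpoint, OLD for the one before it) is an assumption, as are the function names save_checkpoint and load_checkpoint.

```python
import os

def save_checkpoint(base, data):
    """Write data (bytes) as the newest checkpoint, keeping one earlier copy.

    Assumed suffix meanings (not confirmed by the thesis):
    .NEW = being written, .LST = latest complete, .OLD = previous complete.
    """
    new, lst, old = base + ".NEW", base + ".LST", base + ".OLD"
    with open(new, "wb") as f:
        f.write(data)
        f.flush()
        os.fsync(f.fileno())      # push the bytes to disk before renaming
    if os.path.exists(lst):
        os.replace(lst, old)      # previous checkpoint becomes the fallback
    os.replace(new, lst)          # finished file becomes the latest checkpoint

def load_checkpoint(base):
    """Return the most recent complete checkpoint, or None if there is none."""
    for suffix in (".LST", ".OLD"):
        try:
            with open(base + suffix, "rb") as f:
                return f.read()
        except FileNotFoundError:
            continue
    return None
```

Because a file is renamed only after it has been written completely, a crash during a write leaves at most a partial .NEW file, and the restart can still read the .LST or .OLD copy. This is one plausible reason such a scheme needs only two retained checkpoints.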

An internal version number, manually assigned to the objective function, indicates whether the global grade can be verified during a checkpoint restart. Any change to the objective function, including a change to the user-designated coefficients, invalidates, and therefore bypasses, the verification.
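
The version check described above can be sketched as follows. This is a minimal illustration under stated assumptions, not DAO's implementation: the constant OBJECTIVE_VERSION, the function verify_on_restart, its parameters, and the tolerance are all hypothetical names and values chosen for the example.

```python
OBJECTIVE_VERSION = 3  # hypothetical value; bumped by hand when the objective changes

def verify_on_restart(saved_version, saved_grade, recompute_grade, tol=1e-9):
    """Check a checkpointed global grade against a freshly computed one.

    Returns "skipped" when the objective function has changed since the
    checkpoint was written, "verified" when the grades agree, and raises
    ValueError when the same objective yields a different grade.
    """
    if saved_version != OBJECTIVE_VERSION:
        # The stored grade came from a different objective, so it cannot
        # be compared meaningfully; bypass the verification.
        return "skipped"
    grade = recompute_grade()
    if abs(grade - saved_grade) > tol:
        raise ValueError(
            f"checkpoint grade {saved_grade} does not match recomputed {grade}"
        )
    return "verified"
```

In this sketch a changed coefficient would have to be reflected in the version number (or in a similar stored marker) for the bypass to trigger; how DAO detects coefficient changes is not specified here.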

The checkpoint files are also used by Dorm_View to display the results of a previous batch.

elena s ackley 2002-01-20