Since there seemed to be no basis for deriving a single optimal schedule, we developed five annealing schedules: two static types (one algorithmic and one table-driven) and three dynamic adaptive algorithms, each with its own criteria for reducing the temperature and phase length and for determining completion. An algorithm options file facilitates experimentation with the SA parameters.
We began our study in DAO version v0.09 with an ECS variant: starting with an initial temperature and phase, after a phase's worth of attempts the temperature is decreased by a configurable factor and the length of the phase doubled: $T_{k+1} = T_k / c$ and $P_{k+1} = 2P_k$, where the cooling factor $c$ is set in the options file. The process stops when the temperature reaches a configurable value. Running at high temperatures for a short time and at lower temperatures for longer times produced excellent results at the expense of longer-than-desired elapsed times -- taking several hours for a single round. This `cadillac' algorithmic schedule (denoted sa0) remains our baseline for comparison.
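A minimal sketch of this schedule may make the shape clearer; the function name, default values, and division-based cooling are illustrative assumptions, not DAO's actual identifiers or defaults:

```python
# Sketch of an sa0-style schedule: divide the temperature by a cooling
# factor after each phase and double the phase length.
def sa0_schedule(temp=100.0, phase_len=1000, cool_factor=10.0, end_temp=0.001):
    """Yield (temperature, phase_length) pairs until temp reaches end_temp."""
    while temp > end_temp:
        yield temp, phase_len       # run phase_len attempts at this temperature
        temp /= cool_factor         # T_{k+1} = T_k / c
        phase_len *= 2              # P_{k+1} = 2 * P_k

# Short phases at high temperatures, progressively longer phases as it cools:
for t, p in sa0_schedule():
    print(f"{p:>8} attempts at T = {t:g}")
```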
DAO version v0.10 introduced a table-driven version (sa1) that disassociates the phase and the temperature. Up to sixteen phase-temperature pairs are configurable in the algorithm options file. The extreme flexibility of this table approach has been useful in configuring various test situations, though challenging to refine for production.
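A sketch of the idea follows; the (phase length, temperature) pairs below are invented for illustration -- in DAO they come from the algorithm options file:

```python
# Sketch of an sa1-style table-driven schedule: phase length and
# temperature vary independently, row by row, up to sixteen entries.
SA1_TABLE = [
    (1000, 50.0),
    (2000, 10.0),
    (4000, 1.0),
    (8000, 0.01),
]

def sa1_schedule(table=SA1_TABLE):
    for phase_len, temp in table:
        yield temp, phase_len
```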
In search of easier-to-configure approaches, work on adaptive cooling began in November 1998 with DAO version v0.17, resulting in two approaches: an `attempt-based' scheme (sa2) and an `accept-based' scheme (sa3). Most recently, as of version v1.05, a `reheating' scheme (sa4) has been developed as well.
The three adaptive versions, similar in structure, try to sense when to change the temperature. Each keeps a running average of the one thousand most recent cost differences, where an improved score has a negative cost difference. The attempt-based scheme (sa2) stores all attempts in the history buffer, including unaccepted attempts that produce no change in the overall score, while the accept-based approaches (sa3 & sa4) keep only the accepted attempts.
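A sketch of this shared history buffer, assuming a fixed window over the 1,000 most recent cost differences (the class and method names are illustrative):

```python
from collections import deque
from statistics import mean, pstdev

class CostHistory:
    """Window over recent cost differences; negative = an improved score."""
    def __init__(self, window=1000):
        self.diffs = deque(maxlen=window)   # oldest entry drops off when full

    def record(self, cost_diff):
        # sa2 records every attempt; sa3 and sa4 record only accepted moves
        self.diffs.append(cost_diff)

    def average(self):
        return mean(self.diffs)

    def std(self):
        return pstdev(self.diffs)
```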
When a phase completes (a phase is initially 2,000 attempted or accepted moves, by default) and the temperature hasn't changed for a configurable number of phases, a new temperature is computed by comparing the standard deviation of the last 1,000 attempted or accepted cost differences (STD) to the standard deviation from the previous phase (LastSD). The schemes differ as follows (a sketch follows the table):
| STD vs. LastSD | sa2                       | sa3                       | sa4                                           |
|----------------|---------------------------|---------------------------|-----------------------------------------------|
| less than      | cool TEMP by TEMP_DIVISOR | cool TEMP by TEMP_DIVISOR | set TEMP to (STD + LastSD) / 2 / TEMP_DIVISOR |
| equal to       | cool TEMP by TEMP_DIVISOR | -                         | set TEMP to (STD + LastSD) / 2 / TEMP_DIVISOR |
| greater than   | -                         | -                         | set TEMP to (STD + LastSD) / 2 / TEMP_DIVISOR |
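The decision logic of the table reduces to a short sketch; the scheme tags, function name, and parameters are illustrative, with std and last_std standing for STD and LastSD:

```python
# Sketch of the temperature decision from the table above.
def update_temperature(scheme, temp, std, last_std, temp_divisor=10):
    if scheme == "sa2" and std <= last_std:
        return temp / temp_divisor     # cool on `less than' or `equal to'
    if scheme == "sa3" and std < last_std:
        return temp / temp_divisor     # cool only on a strict decrease
    if scheme == "sa4":
        # reheating: the temperature tracks cost variability in every case
        return (std + last_std) / 2 / temp_divisor
    return temp                        # otherwise unchanged
```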
Inspired by the method of reheating as a function of cost (RFC) [EFC98], the reheating adaptive scheme (sa4) uses the standard deviation of the cost changes as the basis for the temperature. To compensate for the extreme variability of the standard deviations, we use the average of the standard deviations for the current and previous phases and divide this average by TEMP_DIVISOR to maintain a downward momentum.
The TEMP_DIVISOR is a configurable parameter, set to ten by default. A phase count serves as a delay mechanism, determining the number of phases to wait before trying another temperature adjustment. If the temperature is changed, the phase count starts over at zero; otherwise, each scheme does one of the following:
| STD vs. LastSD | sa2              | sa3               | sa4              |
|----------------|------------------|-------------------|------------------|
| less than      | -                | `ease' phase test | incr phase-count |
| equal to       | -                | `ease' phase test | incr phase-count |
| greater than   | incr phase-count | incr phase-count  | incr phase-count |
| zero           | incr phase-count | -                 | -                |
The accept-based adaptive scheme (sa3) also considers `easing' down the phase length -- reducing the number of accepts required before recomputing parameters -- to compensate for the increase in attempts required per accepted move at cooler temperatures. This `ease' test compares the percentage of improved grade differences over the number of accepts to DECIDE_TO_EASE (set to 10 percent by default). If the percentage is less, we reduce the phase length by the ACCEPT_REDUCER (in half by default), so the next decision point is reached after fewer accepts and attempts. The other accept-based schedule (sa4) uses a cost-sensitive temperature instead of `easing', while the attempt-based approach (sa2) also increments the phase count when the standard deviation is zero, as noted in the previous table and sketched below.
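A sketch combining the phase-count table with sa3's `ease' test; DECIDE_TO_EASE and ACCEPT_REDUCER follow the defaults named above, while the state fields and function names are invented for illustration:

```python
DECIDE_TO_EASE = 0.10    # 10 percent by default
ACCEPT_REDUCER = 2       # halve the phase length by default

def phase_bookkeeping(scheme, std, last_std, state):
    """Run when the temperature was NOT changed; `-' cells do nothing."""
    if std == 0:                    # `zero' row of the table
        if scheme == "sa2":
            state.phase_count += 1
        return
    if std > last_std:              # `greater than' row: all schemes count
        state.phase_count += 1
    elif scheme == "sa3":           # `less than' / `equal to' rows
        ease_phase_length(state)
    elif scheme == "sa4":
        state.phase_count += 1

def ease_phase_length(state):
    # If fewer than DECIDE_TO_EASE of the accepts improved the grade,
    # require fewer accepts before the next decision point.
    if state.improved_accepts / state.accepts < DECIDE_TO_EASE:
        state.phase_length //= ACCEPT_REDUCER
```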
The base termination case for every schedule is a temperature less than ENDTEMP (0.001 by default). Additionally, the phase value can terminate a round in the accept-based adaptive scheme (sa3) and the table-driven scheme (sa1).
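A sketch of the termination test under those two conditions; END_TEMP mirrors ENDTEMP, and the phase-limit mechanism for sa1 and sa3 is an assumed name:

```python
END_TEMP = 0.001

def round_done(scheme, temp, phase_count, phase_limit):
    if temp < END_TEMP:              # base case for every schedule
        return True
    if scheme in ("sa1", "sa3"):     # the phase value can also end a round
        return phase_count >= phase_limit
    return False
```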
The cooling schedule for each round, per batch, can be limited to one scheme or rotated among all the schemes, beginning with any specified scheme. We keep track of the best grade over a series of rounds until we fail to see a better grade within a pre-configured number of rounds (NOBETTER). At that point, we restore the best state found and switch to a hillclimbing algorithm, as discussed in Section 4.3.7. A sketch of this round loop follows.
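In this sketch, run_round is a caller-supplied function and all other names are illustrative; the NOBETTER default shown is invented:

```python
# Sketch of the outer round loop: rotate among schemes, track the best
# grade, and stop after `nobetter' rounds without improvement.
SCHEMES = ["sa0", "sa1", "sa2", "sa3", "sa4"]

def run_batch(run_round, schemes=SCHEMES, nobetter=5, first=0, rotate=True):
    """run_round(scheme) -> (grade, state); lower grades are better."""
    best_grade, best_state, stale = float("inf"), None, 0
    i = first
    while stale < nobetter:
        grade, state = run_round(schemes[i])
        if grade < best_grade:
            best_grade, best_state, stale = grade, state, 0
        else:
            stale += 1
        if rotate:
            i = (i + 1) % len(schemes)
    return best_state   # restored, then handed to hillclimbing (Section 4.3.7)
```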
Figures 3.1 through 3.3 show the `Temperature vs. Attempts' graphs for sample rounds that achieved the lowest overall results per scheme. The two static approaches behaved the same each semester, while the three dynamic schedules varied. The smooth curve produced by sa0 took the greatest number of attempts each semester, while the steepest graph with the fewest total attempts each semester was made by sa1. The stair-step curves made by the two adaptive schedules show that sa3 quit before sa2, due to the `easing' process. The reheating adaptive schedule, sa4, was non-monotonic and stopped abruptly after nearly half the attempts taken by the cadillac schedule (sa0).
Percentage indicators for each semester, from the best of our sample test rounds alongside the actual performances, are presented in Table 3.1. Since changes in the objective function and data between DAO runs void any score comparisons, other indicators may be used instead: `perfect score' (students without an error), `satisfaction' (the percentage of students receiving one of their requested halls), and error counts. Statistical test scores using the five cooling schedules are compared in Section 4.3.