Chapter 2 HPSS Planning
September 2002, HPSS Installation Guide, Release 4.5, Revision 2

storage class reaches the threshold configured in the purge policy for that storage class. Remember that simply adding migration and purge policies to a storage class will cause MPS to begin running against the storage class, but it is also critical that the hierarchies to which that storage class belongs be configured with proper migration targets in order for migration and purge to perform as expected.

The purpose of disk migration is to make one or more copies of data stored in a disk storage class to lower levels in the storage hierarchy. BFS uses a metadata queue to pass migration records to MPS. When a disk file needs to be migrated (because it has been created, modified, or undergone a class of service change), BFS places a migration record on this queue. During a disk migration run on a given storage class, MPS uses the records on this queue to identify files which are migration candidates. Migration records on this queue are ordered by storage hierarchy, file family, and record create time, in that order. This ordering determines the order in which files are migrated.

MPS allows disk storage classes to be used atop multiple hierarchies (to avoid fragmenting disk resources). To avoid unnecessary tape mounts, it is desirable to migrate all of the files in one hierarchy before moving on to the next. At the beginning of each run MPS selects a starting hierarchy. This is stored in the MPS checkpoint metadata between runs. The starting hierarchy alternates to ensure that, when errors are encountered or the migration target is not 100 percent, all hierarchies are served equally. For example, if a disk storage class is being used in three hierarchies, 1, 2, and 3, successive runs will migrate the hierarchies in the following order: 1-2-3, 3-1-2, 2-3-1, 1-2-3, etc. A migration run ends when either the migration target is reached or all of the eligible files in every hierarchy are migrated.
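
The queue ordering and the rotation of the starting hierarchy described above can be illustrated with a short sketch. This is not MPS code: the record layout, the function names, and the use of a simple run counter in place of the MPS checkpoint metadata are all assumptions made for illustration. It only reproduces the ordering rule (hierarchy, file family, record create time) and the 1-2-3, 3-1-2, 2-3-1 sequence given in the example.

```python
from collections import namedtuple

# Hypothetical record layout; the field names are illustrative only.
MigrationRecord = namedtuple(
    "MigrationRecord", "hierarchy family create_time file_id"
)


def queue_order(records):
    """Sort records by hierarchy, then file family, then record create time."""
    return sorted(records, key=lambda r: (r.hierarchy, r.family, r.create_time))


def hierarchy_order(hierarchies, run_index):
    """Return the hierarchies in the order a given run would visit them.

    The list is rotated right by one position per run, so each hierarchy
    takes its turn as the starting hierarchy. run_index stands in for the
    checkpointed starting hierarchy (an assumption of this sketch).
    """
    n = len(hierarchies)
    if n == 0:
        return []
    k = run_index % n
    return hierarchies[n - k:] + hierarchies[:n - k]


if __name__ == "__main__":
    for run in range(4):
        print(hierarchy_order([1, 2, 3], run))
    # [1, 2, 3]
    # [3, 1, 2]
    # [2, 3, 1]
    # [1, 2, 3]
```

Because every run visits all hierarchies unless it stops early, the rotation only matters when a run ends before the last hierarchy is reached, which is the case the text calls out (errors, or a migration target below 100 percent).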
Files are ordered by file family for the same reason (to avoid unnecessary tape mounts), although families are not checkpointed as hierarchies are. Finally, the record create time is simply the time at which BFS adds the migration record to the queue, and so files in the same storage class, hierarchy, and family tend to migrate in the order in which they are written (actually the order in which the write completes).

When a migration run for a given storage class starts work on a hierarchy, it sets a pointer in the migration record queue to the first migration record for the given hierarchy and file family. Following this, migration attempts to build lists of 256 migration candidates. Each migration record read is evaluated against the values in the migration policy. If the file in question is eligible for migration, its migration record is added to the list. If the file is not eligible, it is skipped and it will not be considered again until the next migration run. When 256 eligible files are found, MPS stops reading migration records and does the actual work to migrate these files. This cycle continues until either the migration target is reached or all of the migration records for the hierarchy in question are exhausted.

The purpose of disk purge is to maintain a given amount of free space in a disk storage class by removing data of which copies exist at lower levels in the hierarchy. BFS uses another metadata queue to pass purge records to MPS. A purge record is created for any disk file which may be removed from a given level in the hierarchy (because it has been migrated or staged). During a disk purge run on a given storage class, MPS uses the records on this queue to identify files which are purge candidates. The order in which purge records are sorted may be configured on the purge policy, and this determines the order in which files are purged. It should be noted that all of the options except purge record create time require additional metadata updates and can impose extra overhead on SFS.
Also, unpredictable purge behavior may be observed if the purge record ordering is changed with existing purge records in the system until these existing records are cleared. Purge operates strictly on a storage class basis, and makes no consideration of hierarchies or file families. MPS builds lists of 32 purge records, and each file is evaluated for purge at the point when its purge record is read. If a file is deemed to be ineligible, it will not be considered again until the next purge run. A purge run ends when either the supply of purge records is exhausted or the purge target is reached.
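
Migration and purge runs share the same overall shape: records are read in queue order, ineligible files are skipped until the next run, work is done in fixed-size lists (256 for migration, 32 for purge), and the run stops when the target is reached or the records run out. The sketch below shows that loop under stated assumptions. The function and parameter names are hypothetical, the eligibility test and the target check are supplied by the caller, and treating purge lists exactly like migration lists is a simplification of this sketch rather than a description of how MPS is implemented.

```python
MIGRATION_LIST_SIZE = 256  # list size given in the text for migration
PURGE_LIST_SIZE = 32       # list size given in the text for purge


def run_lists(records, is_eligible, process_list, target_reached, list_size):
    """Process eligible records in lists of list_size; return the count handled.

    records        -- iterable of records, already in queue order
    is_eligible    -- callable(record) -> bool, stands in for the policy check
    process_list   -- callable(list) that migrates or purges the listed files
    target_reached -- callable() -> bool, checked after each list is processed
    """
    handled = 0
    candidates = []
    for record in records:
        if is_eligible(record):
            candidates.append(record)
        # An ineligible record is simply skipped; it is not revisited
        # until the next run.
        if len(candidates) == list_size:
            process_list(candidates)
            handled += len(candidates)
            candidates = []
            if target_reached():
                return handled
    if candidates:
        # The records were exhausted before the last list filled up.
        process_list(candidates)
        handled += len(candidates)
    return handled


if __name__ == "__main__":
    done = []
    count = run_lists(
        records=range(10),
        is_eligible=lambda r: r % 2 == 0,
        process_list=done.extend,
        target_reached=lambda: len(done) >= 4,
        list_size=2,
    )
    print(count, done)  # 4 [0, 2, 4, 6]
```

A real run would pass `MIGRATION_LIST_SIZE` or `PURGE_LIST_SIZE` as `list_size`; the small value in the usage example only keeps the output readable.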