Standard Backup Retention Policies
These are the standard backup policies for users who have contracted with UF-IT's Tivoli backup group to back up their data on Research Computing systems.
How Policies are Applied
In Tivoli, a retention policy is applied to a "node". A node can have multiple file spaces backed up to it, but only one policy can be applied to the node. ICT/NSAM charges the customer for the node's file space utilization.
If you need different policies for different file spaces, a separate node must be created for each policy.
Default Policy Settings
|Setting||Value||Description|
|Versions Data Exists||7 copies||Applies to both live and deleted files, excluding the last copy of a deleted file. If a file is changed, the old version is held in backups up to the time limit set in Retain Extra Versions. If more versions of the file are backed up than this limit allows, the oldest version is dropped from backups.|
|Versions Data Deleted||5 copies||Once a file is deleted, this setting governs the retained copies of the file. If the system holds more versions than this limit (as Versions Data Exists permits), the oldest extra versions are dropped from backups.|
|Retain Extra Versions||60 days||The number of days that extra versions of a file are held in the system.|
|Retain Only Version||90 days||The number of days that the last version of a deleted file is held in the system.|
How Files are Stored
Tivoli backs up data at the granularity of a file. If a file is changed, the entire file is backed up again, and the old version becomes one of the file's "Extra Versions", as defined by the policies above.
This can be costly if your files are very large and change often, because every extra version takes up the full size of the file. If you have one 10 GB file and make a small change to it, the whole 10 GB is backed up again.
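To put rough numbers on this, the sketch below estimates how much backup space a single frequently changed file can occupy. It is a simplification, not Tivoli's own accounting: the helper name `backup_footprint_gb` is hypothetical, it assumes the Versions Data Exists limit of 7 counts the current version as well as the older ones, and it assumes every change falls within the 60-day Retain Extra Versions window.

```python
def backup_footprint_gb(file_size_gb: float, changes: int,
                        versions_data_exists: int = 7) -> float:
    """Estimate the backup space held by one file after `changes` modifications.

    Assumption: each change stores a full copy of the file, and the version
    limit includes the current version.
    """
    if changes < 0:
        raise ValueError("changes must be non-negative")
    # One version for the initial backup, plus one per change, capped by policy.
    versions_kept = min(changes + 1, versions_data_exists)
    return file_size_gb * versions_kept


if __name__ == "__main__":
    for changes in (0, 3, 30):
        print(changes, "changes:", backup_footprint_gb(10, changes), "GB")
```

Under these assumptions, a 10 GB file that changes every day levels off at 70 GB of backup space, seven times its size on disk.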
The Nitty-Gritty of Tape Backups
Each time a backup runs, the Tivoli system works through a decision tree for every file:
- If the file is new, it is simply backed up to the Tivoli system. Two things happen as a result:
  - The file data is stored to tape.
  - The file metadata is stored in a catalog. This includes information about the file itself (size, where it came from, etc.) and where it was stored on tape (which tape, where on the tape, etc.).
- If the file has been backed up before but has since changed, it is backed up to the Tivoli system again:
  - The file data is stored to tape.
  - Based on the retention policies defined above, the old version of the file is marked for removal from the Tivoli system at a future date (Retain Extra Versions).
  - If the number of old versions of the file exceeds Versions Data Exists, the oldest version is removed from the catalog, and the tape index is marked to show that the space the file occupied is no longer in use.
- The backup also compares the current state of the filesystem with what the Tivoli system expects to see. If a file is missing, it is considered deleted, and the following happens:
  - The most recent version of the file in the Tivoli system is marked with an expiration date based on the Retain Only Version policy setting.
  - The number of versions of the file is compared to Versions Data Deleted, and extra versions are removed from the catalog.
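The decision tree above can be sketched as a small in-memory model. This is an illustration only, not how Tivoli is implemented: the names `Version`, `Catalog`, and `run_backup` are hypothetical, file contents are stand-in strings rather than data on tape, and the code assumes the Versions Data Exists limit counts the current version along with the older ones.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Dict, List, Optional

# Default policy values from the table above.
VERSIONS_DATA_EXISTS = 7
VERSIONS_DATA_DELETED = 5
RETAIN_EXTRA_VERSIONS = timedelta(days=60)
RETAIN_ONLY_VERSION = timedelta(days=90)


@dataclass
class Version:
    content: str                    # stand-in for the file data written to tape
    backed_up: date
    expires: Optional[date] = None  # None means this is the current version


Catalog = Dict[str, List[Version]]  # path -> versions, oldest first


def run_backup(catalog: Catalog, filesystem: Dict[str, str], today: date) -> None:
    """Apply one pass of the simplified decision tree to `catalog` in place."""
    # Drop versions whose retention period has run out.
    for versions in catalog.values():
        versions[:] = [v for v in versions
                       if v.expires is None or v.expires > today]

    for path, content in filesystem.items():
        versions = catalog.setdefault(path, [])
        newest = versions[-1] if versions else None
        if newest is not None and newest.expires is None:
            if newest.content == content:
                continue  # unchanged since the last backup
            # Changed file: the previous version becomes an extra version.
            newest.expires = today + RETAIN_EXTRA_VERSIONS
        # New or changed file: store the data and record it in the catalog.
        versions.append(Version(content, today))
        # Assumption: the limit counts the current version too.
        del versions[:-VERSIONS_DATA_EXISTS]

    # Files known to the catalog but missing from the filesystem are deleted.
    for path, versions in catalog.items():
        if path in filesystem or not versions:
            continue
        newest = versions[-1]
        if newest.expires is None:
            newest.expires = today + RETAIN_ONLY_VERSION
        del versions[:-VERSIONS_DATA_DELETED]


if __name__ == "__main__":
    catalog: Catalog = {}
    run_backup(catalog, {"results.dat": "v1"}, date(2024, 1, 1))  # new file
    run_backup(catalog, {"results.dat": "v2"}, date(2024, 1, 2))  # changed
    run_backup(catalog, {}, date(2024, 1, 3))                     # deleted
    for version in catalog["results.dat"]:
        print(version.content, "expires", version.expires)
```

In the usage example, the first version picks up a 60-day expiry when the file changes, and the last version picks up a 90-day expiry when the file disappears from the filesystem.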
As time progresses, more and more of the files on a tape are marked as removed, either because they were deleted or because older versions expired, until little of the tape still holds valid data. At a certain threshold, the Tivoli system reads all of the still-valid files on the tape and writes them to a new tape, updating the catalog with their new locations. Once this operation is done, the old tape can be reused as a new tape.
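The reclamation step described above can be sketched as follows. The threshold value, the tape names, and the function names are all hypothetical: this page does not state the actual threshold, and the real system tracks far more than a list of files per tape.

```python
from typing import Dict, List, Tuple

# Hypothetical threshold; the real value is not given on this page.
RECLAIM_THRESHOLD = 0.60  # reclaim once 60% of the stored data is no longer valid

TapeEntry = Tuple[str, int, bool]  # (file_id, size_bytes, still_valid)


def reclaimable_fraction(tape: List[TapeEntry]) -> float:
    """Return the share of the tape's stored bytes that is no longer valid."""
    total = sum(size for _, size, _ in tape)
    dead = sum(size for _, size, valid in tape if not valid)
    return dead / total if total else 0.0


def reclaim(tape: List[TapeEntry], new_tape_name: str,
            catalog: Dict[str, str]) -> List[TapeEntry]:
    """Copy still-valid files to a new tape and point the catalog at it."""
    new_tape = [(file_id, size, True) for file_id, size, valid in tape if valid]
    for file_id, _, _ in new_tape:
        catalog[file_id] = new_tape_name  # record the file's new location
    tape.clear()  # the old tape is now empty and can be reused
    return new_tape


if __name__ == "__main__":
    old_tape = [("a", 100, False), ("b", 300, False), ("c", 100, True)]
    catalog = {"c": "TAPE001"}
    if reclaimable_fraction(old_tape) >= RECLAIM_THRESHOLD:
        new_tape = reclaim(old_tape, "TAPE002", catalog)
        print(new_tape, catalog, old_tape)
```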
Requesting a File Recovery
If a file needs to be recovered from the Tivoli system, please submit a service request in Research Computing's ticketing system at http://support.rc.ufl.edu
Costs for Backup
Backup to the Tivoli system costs $78/TB/Year, or $6.50/TB/Month. This is the cost of the data stored on tape; there are no transfer fees for backing up to tape or recovering data from tape. In addition, these backups do not count against the quotas on the system. The backup space used is a combination of the contents of the directories designated by the user, deleted files that are still within the retention period, and changed files that are still within the retention period.
As a general rule of thumb, you can expect to be charged for approximately one and a half times the space you actually use on the filesystem, because changed and deleted files are retained for the periods listed above.
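As a worked example of the rule of thumb, the sketch below turns filesystem usage into an estimated bill. The function name `estimate_backup_cost` is hypothetical, and the 1.5 factor is only the approximation quoted above; actual charges depend on how often your files change and are deleted.

```python
RATE_PER_TB_YEAR = 78.00  # dollars, from the rate quoted above
OVERHEAD_FACTOR = 1.5     # rule-of-thumb multiplier for retained versions


def estimate_backup_cost(tb_on_filesystem: float,
                         overhead: float = OVERHEAD_FACTOR) -> dict:
    """Estimate billed backup space and cost from filesystem usage in TB."""
    billed_tb = tb_on_filesystem * overhead
    per_year = billed_tb * RATE_PER_TB_YEAR
    return {
        "billed_tb": billed_tb,
        "per_year": per_year,
        "per_month": per_year / 12,
    }


if __name__ == "__main__":
    print(estimate_backup_cost(4))
```

For 4 TB on the filesystem, this estimates 6 TB of billed backup space, or about $468 per year ($39 per month).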
- Backups regularly report errors for files that were not backed up because they changed while the backup process was running on them. This typically occurs for one of two reasons:
  - The file was deliberately modified, either by the user or by a running job. If the file is a permanent file, it will be backed up the next time the backup process runs, one day later.
  - The file is a temporary file being used by a job. Because these files are temporary by nature, it is actually a good thing that the file is not backed up: by the next backup run it will no longer exist. Some temporary files are backed up from time to time, only to be marked as deleted the next time the backup runs. These files remain in the backup system for 90 days, the length of time it takes for a file to expire after being deleted.