Difference between revisions of "Tivoli Backup"

 
Latest revision as of 16:29, 10 January 2023

Purchasing Tape Backup

See https://hosting.it.ufl.edu/services/backup-archive/ for full details. The cost is $6.50 per TB of data per month. The amount of space the backup actually uses depends on how many snapshots are kept and on the retention time you select; see the retention policies below. Open a Support Request to set up backup for a /blue or /orange directory tree.

Standard Backup Retention Policies

These are the standard backup policies for users who have contracted with UF-IT's Tivoli backup group to provide backups of their data on Research Computing systems.

How Policies are Applied

In Tivoli backup, a policy is applied to a "Node". A node can have multiple file spaces backed up to it, but only one policy can be applied to the node. ICT/NSAM charges the customer for the file space utilization of the node.

If different file spaces need different policies, a separate node must be created for each policy.

Default Policy Settings

Policy | Default Setting | Description
Versions Data Exists | 7 copies | Applies to both live and deleted files, excluding the last copy of a deleted file. When a file changes, the old version is held in backups up to the time limit set in Retain Extra Versions. If more versions than this are backed up, the oldest version is dropped from backups.
Versions Data Deleted | 5 copies | Once a file is deleted, this limit applies to its retained copies. If more versions were held under Versions Data Exists, the oldest extra versions are dropped from backups.
Retain Extra Versions | 60 days | The number of days that extra versions of a file are held in the system.
Retain Only Version | 90 days | The number of days that the last version of a file that has been deleted is held in the system.

How Files are Stored

Incremental Granularity

Files are backed up to Tivoli at the granularity of a whole file. If a file changes, the entire file is backed up again, and the old version becomes one of the "Extra Versions" of the file, as defined by the policies above.

This can be costly if your files are very large and change often, because the extra versions take up significantly more space. If you have one file that is 10 GB in size and you make a small change to it, the entire 10 GB file is backed up again.
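The effect of whole-file granularity can be estimated with a little arithmetic. The sketch below is only an illustration: the function names are hypothetical, it assumes the 7-copy Versions Data Exists limit counts the current version, and it assumes all changes happen within the 60-day Retain Extra Versions window so that nothing has expired yet.

```python
def versions_held(changed_backups: int, versions_data_exists: int = 7) -> int:
    """Versions kept after the first backup plus `changed_backups` changed re-backups.

    Assumption: the limit counts the current version as one of the copies.
    """
    return min(changed_backups + 1, versions_data_exists)


def backup_space_gb(file_size_gb: float, changed_backups: int,
                    versions_data_exists: int = 7) -> float:
    """Approximate space used by one file that keeps roughly the same size."""
    return file_size_gb * versions_held(changed_backups, versions_data_exists)


# A 10 GB file that is changed before each of 30 consecutive backup runs
# is capped at 7 held versions.
print(backup_space_gb(10, 30))  # 70
```

Under these assumptions, a single frequently edited 10 GB file can account for up to 70 GB of backup space.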

The Nitty-Gritty of Tape Backups

Whenever a file is backed up to the Tivoli system, the following decision tree is applied:

  • If the file is new, it is simply backed up to the Tivoli system. Two things happen:
    • The file data is stored to tape.
    • The file metadata is stored in a catalog. This includes information about the file itself (size, where it came from, etc.) and where it was stored on tape (which tape, where on the tape, etc.).
  • If the file has been backed up previously but has changed, it is backed up to the Tivoli system as well:
    • The file data is stored to tape.
    • Based on the retention policies defined above, the old version of the file is marked for removal from the Tivoli system at a date in the future (Retain Extra Versions).
    • If the number of old versions of the file exceeds Versions Data Exists, the oldest version of the file is removed from the catalog, and the tape index is marked to show that the space that version occupied is no longer in use.
  • Each backup run also compares the current state of the filesystem with what the Tivoli system expects to see. If a file is missing, it is considered deleted, and the following happens:
    • The most recent version of the file in the Tivoli system is marked with an expiration date based on the Retain Only Version policy setting.
    • The number of versions of the file is compared to Versions Data Deleted, and extra versions are removed from the catalog.
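The decision tree above can be sketched as a small in-memory model. This is not how the Tivoli server is implemented; the class and method names are hypothetical, the model tracks only the catalog (not tape contents), and it assumes that the version limits count the most recent version and that expiration clocks start on the day a version is superseded or the file is found to be deleted.

```python
from dataclasses import dataclass, field
from datetime import date, timedelta
from typing import Optional


@dataclass
class Policy:
    # Default values taken from the policy table above.
    versions_data_exists: int = 7
    versions_data_deleted: int = 5
    retain_extra_days: int = 60
    retain_only_days: int = 90


@dataclass
class Version:
    backed_up: date
    expires: Optional[date] = None  # None means no expiration date is set


@dataclass
class Catalog:
    policy: Policy = field(default_factory=Policy)
    files: dict = field(default_factory=dict)  # path -> list of Version, newest last

    def backup(self, path: str, today: date) -> None:
        """Handle a new or changed file."""
        versions = self.files.setdefault(path, [])
        if versions:
            # The previous newest version becomes an extra version.
            versions[-1].expires = today + timedelta(days=self.policy.retain_extra_days)
        versions.append(Version(backed_up=today))
        # Drop the oldest versions beyond the limit (assumes a limit of at least 1).
        del versions[:-self.policy.versions_data_exists]

    def mark_deleted(self, path: str, today: date) -> None:
        """Handle a file that is missing from the filesystem."""
        versions = self.files[path]
        del versions[:-self.policy.versions_data_deleted]
        versions[-1].expires = today + timedelta(days=self.policy.retain_only_days)


# Hypothetical path, used only for illustration.
catalog = Catalog()
start = date(2023, 1, 1)
for day in range(9):  # the file changes before each of nine daily backups
    catalog.backup("/blue/group/data.bin", start + timedelta(days=day))
catalog.mark_deleted("/blue/group/data.bin", start + timedelta(days=9))
print(len(catalog.files["/blue/group/data.bin"]))  # 5
```

A fuller model would also remove versions whose expiration date has passed; that step is omitted here to keep the sketch short.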

Over time, more and more of the files on a tape are marked as removed, either because they were deleted or because older versions expired, until much of the tape no longer holds relevant data. At a certain threshold, the Tivoli system reads all of the still-valid files on the tape and writes them to a new tape, updating the catalog with their new locations. Once this operation is complete, the old tape can be reused as a new tape.
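The reclamation step described above can be illustrated with a toy model. The threshold value is not stated on this page, so it is a parameter here; the function name, the data layout, and the use of file counts (rather than bytes) to measure how much of a tape is still valid are all simplifying assumptions.

```python
def reclaim(tape_files: dict[str, dict[str, bool]], threshold: float,
            new_tape: str) -> dict[str, str]:
    """Move still-valid files off mostly-expired tapes.

    tape_files maps a tape name to {file_id: is_still_valid}.
    Returns the updated catalog locations {file_id: new_tape}.
    """
    new_locations: dict[str, str] = {}
    for tape, files in tape_files.items():
        if not files:
            continue  # already empty, nothing to reclaim
        valid = [file_id for file_id, is_valid in files.items() if is_valid]
        if len(valid) / len(files) < threshold:
            for file_id in valid:
                new_locations[file_id] = new_tape  # catalog now points at the new tape
            files.clear()  # the old tape can be reused
    return new_locations


# Hypothetical tape names and a hypothetical 50% threshold.
tapes = {
    "T1": {"a": True, "b": False, "c": False, "d": False},
    "T2": {"e": True, "f": True},
}
print(reclaim(tapes, threshold=0.5, new_tape="T3"))  # {'a': 'T3'}
```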

Requesting a file recovery

If a file needs to be recovered from the Tivoli system, please submit a service request in Research Computing's ticketing system at http://support.rc.ufl.edu

Costs for Backup

The cost for backup to the Tivoli backup system is $78/TB/Year, or $6.50/TB/Month. This is the cost of the data stored on tape; there are no transfer fees for backing up to tape or recovering data from tape. In addition, the quotas on the system do not apply to these backups. The backup space used is a combination of the contents of the directories designated by the user, deleted files that are within the retention period, and changed files that are within the retention period.

Cost Estimation

As a general rule of thumb, expect to be charged for approximately one and a half times the space you actually use on the filesystem, because changed and deleted files are retained for the periods listed above.
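The pricing and rule of thumb above translate into a quick estimate. This is a rough sketch only: the function names are hypothetical, the 1.5 factor is the approximation quoted above rather than a measured value, and any rounding or proration applied in actual billing is not described on this page.

```python
RATE_PER_TB_MONTH = 6.50   # dollars per TB per month, as quoted above
OVERHEAD_FACTOR = 1.5      # rule-of-thumb multiplier for retained versions


def estimate_monthly_cost(filesystem_tb: float) -> float:
    """Estimated monthly charge for a directory tree of the given size."""
    return filesystem_tb * OVERHEAD_FACTOR * RATE_PER_TB_MONTH


def estimate_annual_cost(filesystem_tb: float) -> float:
    """Estimated yearly charge (12 monthly charges)."""
    return 12 * estimate_monthly_cost(filesystem_tb)


# 10 TB on the filesystem is billed as roughly 15 TB of backup data.
print(estimate_monthly_cost(10))  # 97.5
print(estimate_annual_cost(10))   # 1170.0
```

Directories with many large, frequently changing files may exceed the 1.5 factor, so the estimate is best treated as a starting point.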

Notes

Backups regularly report errors for files that were not backed up because they changed while the backup process was running on them. This typically happens for one of two reasons:

  • The file was deliberately modified, either by the user or by a running job. If the file is a permanent file, it will be backed up the next time the backup process runs, in one day.
  • The file is a temporary file being used by a job. It is actually a good thing that such a file is not backed up, because it will no longer exist the next time the backup process runs. Some temporary files are backed up from time to time, only to be marked as deleted the next time the backup runs. These files remain in the backup system for 90 days, the length of time it takes for a file to expire after being deleted.