Tivoli Backup

Latest revision as of 18:20, 23 August 2024

Summary

You can safeguard your data on HiPerGator with Tivoli Storage Manager (TSM) tape backup, which is replicated to an external site (Atlanta, GA) for disaster recovery. UFIT Research Computing configures the backup on HiPerGator and acts as an intermediary for the UFIT ICT unit that provides the service.

The cost is $6.50 per TB of data per month. The amount of space the backup actually uses depends on the number of snapshots (copies) retained and the retention time. Purchasing this service obligates you to cover all costs incurred for the duration of the investment. No charges are applied at purchase time; bills are sent monthly for payment.

A support ticket is required to configure the backup. It should specify the directories to be backed up, the number of snapshots to retain, and the retention time for snapshots. The defaults are described below. Backup service does not start until these details are established, the backup is configured, and you are notified. It ends when the backup investment expires. To recover backed-up files, open a support ticket with the details.

See https://it.ufl.edu/hosting/services/backup-and-archive/ for more details about this service.

Purchasing Tape Backup

Use the Internal HiPerGator Service Purchase Form (https://gravity.rc.ufl.edu/access/purchase-request/hpg-service/) to order this service. The cost is $6.50 per TB of data per month. The amount of space the backup actually uses depends on how many snapshots you keep and the retention time you select; see the retention policies below. Open a Support Request (https://support.rc.ufl.edu) to set up backup for a /blue or /orange directory tree.

Standard Backup Retention Policies

These are the standard backup policies for users who have contracted with UF-IT's Tivoli backup group to back up their data on Research Computing systems.

How Policies are Applied

In Tivoli backup, a policy is applied to a "Node". A node can have multiple file spaces backed up to it, but only one policy can be applied to the node. The file space utilization of a node is what the customer is charged for by ICT/NSAM.

If different file spaces need different policies, a separate node must be created for each policy.

Default Policy Settings

Policy | Default Setting | Description
Versions Data Exists | 7 copies | This policy applies to both live and deleted files, excluding the last copy of a deleted file. If a file is changed, the old version of the file is held in backups up to the time limit set in Retain Extra Versions. If more versions of the file are backed up, the oldest version is dropped from backups.
Versions Data Deleted | 5 copies | If a file is deleted, this setting takes effect on the retained copies of the file. If there were more versions of the file on the system, as allowed by Versions Data Exists, the oldest extra versions are dropped from backups.
Retain Extra Versions | 60 days | The number of days that extra versions of a file are held in the system.
Retain Only Version | 90 days | The number of days that the last version of a deleted file is held in the system.

How Files are Stored

Incremental Granularity

Tivoli backs up data at the granularity of a whole file. If a file is changed, the entire file is backed up again, and the old version becomes one of the "Extra Versions" of the file, as defined by the policies above.

This can be costly if your files are very large and change often, because the extra versions take up significantly more space. For example, if you make a small change to a single 10 GB file, the entire 10 GB file is backed up again.
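As a rough illustration of why large, frequently changed files are expensive, the sketch below computes an upper bound on the backup space one file can occupy. It assumes that every change triggers a full copy, that the Versions Data Exists limit (7 by default) counts the current version together with the extra versions, and that no version has yet expired under Retain Extra Versions. The function name and parameters are hypothetical and are not part of any Tivoli tool.

```python
def worst_case_backup_gb(file_size_gb: float, changes: int, versions_kept: int = 7) -> float:
    """Upper bound on backup space (GB) used by one file.

    Assumptions (illustrative only):
    - each change causes the whole file to be backed up again;
    - at most `versions_kept` versions (current + extra) are retained;
    - no version has aged out under Retain Extra Versions yet.
    """
    versions = min(changes + 1, versions_kept)  # original copy plus one per change
    return file_size_gb * versions


if __name__ == "__main__":
    # A 10 GB file changed once: two full copies are held.
    print(worst_case_backup_gb(10, changes=1))
    # The same file changed daily for a month: capped by the version limit.
    print(worst_case_backup_gb(10, changes=30))
```

Under these assumptions, a single 10 GB file that changes every day can hold up to 70 GB of backup space until older versions expire.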

The Nitty-Gritty of Tape Backups

Whenever a file is backed up to the Tivoli system, it goes through the following decision tree:

  • If the file is new, it is simply backed up to the Tivoli system. Two things happen:
    • The file data is stored to tape
    • The file metadata is stored in a catalog. This includes information about the file itself (size, where it came from, etc.) and where it was stored on tape (which tape, where on the tape, etc.)
  • If the file has been backed up before but has changed, it is backed up to the Tivoli system again:
    • The file data is stored to tape
    • Based on the retention policies defined above, the old version of the file is marked for removal from the Tivoli system at a date in the future (Retain Extra Versions)
    • If the number of old versions of the file exceeds Versions Data Exists, the oldest version of the file is removed from the catalog, and the tape index is marked to show that the space that file was taking up is no longer being utilized.
  • Each backup run also compares the current state of the filesystem with what the Tivoli system expects to see. If a file is missing, it is considered deleted, and the following happens:
    • The most recent version of the file in the Tivoli system is marked with an expiration date based on the Retain Only Version policy setting.
    • The number of versions of the file is compared to Versions Data Deleted, and extra versions are removed from the catalog.
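The decision tree above can be sketched as a small in-memory simulation. This is a toy model for reasoning about retention, not how Tivoli is implemented: the `Policy`, `Version`, and `Catalog` classes are hypothetical, a checksum string stands in for file contents, tape storage is not modeled, and the version limits are assumed to count the current version together with the extra versions.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Dict, List, Optional


@dataclass
class Policy:
    """Default policy values from the table above."""
    versions_data_exists: int = 7
    versions_data_deleted: int = 5
    retain_extra_versions: int = 60  # days
    retain_only_version: int = 90    # days


@dataclass
class Version:
    checksum: str
    backed_up: date
    expires: Optional[date] = None  # None means this is the active version


class Catalog:
    """Toy catalog mapping each path to its versions, oldest first."""

    def __init__(self, policy: Optional[Policy] = None) -> None:
        self.policy = policy or Policy()
        self.files: Dict[str, List[Version]] = {}

    @staticmethod
    def _trim(versions: List[Version], limit: int) -> None:
        # Drop the oldest versions beyond the allowed count.
        excess = len(versions) - limit
        if excess > 0:
            del versions[:excess]

    def run_backup(self, filesystem: Dict[str, str], today: date) -> None:
        p = self.policy
        # Remove versions whose expiration date has passed.
        for versions in self.files.values():
            versions[:] = [v for v in versions if v.expires is None or v.expires > today]
        # New and changed files: store a new version.
        for path, checksum in filesystem.items():
            versions = self.files.setdefault(path, [])
            if versions and versions[-1].expires is None and versions[-1].checksum == checksum:
                continue  # unchanged since the last backup
            if versions:
                # The previous version becomes an extra version.
                versions[-1].expires = today + timedelta(days=p.retain_extra_versions)
            versions.append(Version(checksum, today))
            self._trim(versions, p.versions_data_exists)
        # Files missing from the filesystem are treated as deleted.
        for path, versions in self.files.items():
            if path in filesystem or not versions:
                continue
            if versions[-1].expires is None:  # deletion seen for the first time
                versions[-1].expires = today + timedelta(days=p.retain_only_version)
                self._trim(versions, p.versions_data_deleted)


if __name__ == "__main__":
    start = date(2024, 1, 1)
    catalog = Catalog()
    for day in range(9):  # the file changes every day for nine days
        catalog.run_backup({"/blue/group/a.dat": f"v{day}"}, start + timedelta(days=day))
    print(len(catalog.files["/blue/group/a.dat"]))  # versions kept while the file exists
    catalog.run_backup({}, start + timedelta(days=9))  # the file has been deleted
    print(len(catalog.files["/blue/group/a.dat"]))  # versions kept after deletion
```

The example path under /blue is made up. The model keeps everything in memory so that the effect of each policy setting on the number of retained versions is easy to inspect.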

Over time, more and more of the files on a tape are marked as removed, until little of the data on the tape is still valid. This can be caused by file deletions or by older versions of files that have expired. At a certain threshold, the Tivoli system reads all of the still-valid files on the tape and writes them to a new tape, updating the catalog with their new locations. Once this operation is done, the old tape can be reused as a new tape.

Requesting a file recovery

If a file needs to be recovered from the Tivoli system, please submit a service request in Research Computing's ticketing system at http://support.rc.ufl.edu

Costs for Backup

The cost for backup to the Tivoli backup system is $78/TB/Year, or $6.50/TB/Month. This is the cost of the data stored on tape; there are no transfer fees for backing up to tape or recovering data from tape. In addition, the storage quotas on the system do not apply to these backups. The backup space used is a combination of the contents of the directories designated by the user, deleted files that are within the retention period, and changed files that are within the retention period.

Cost Estimation

As a general rule of thumb, you can expect to be charged approximately one and a half times the amount of space you are actually taking up on the filesystem due to changed and deleted files being retained for the periods listed above.
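The rule of thumb can be turned into a quick estimate. The sketch below multiplies the space used on the filesystem by the 1.5 overhead factor and the $6.50/TB/month rate quoted on this page. The function and constant names are hypothetical, and the calculation does not account for how the bill is actually rounded or prorated, which this page does not specify.

```python
RATE_PER_TB_MONTH = 6.50  # dollars per TB per month, from this page
OVERHEAD_FACTOR = 1.5     # rule-of-thumb multiplier for changed and deleted files


def estimate_monthly_cost(filesystem_tb: float,
                          overhead: float = OVERHEAD_FACTOR,
                          rate: float = RATE_PER_TB_MONTH) -> float:
    """Estimate the monthly backup charge in dollars for `filesystem_tb` TB of live data."""
    return round(filesystem_tb * overhead * rate, 2)


if __name__ == "__main__":
    monthly = estimate_monthly_cost(10)  # 10 TB in use on the filesystem
    print(f"monthly: ${monthly:.2f}, yearly: ${monthly * 12:.2f}")
```

For 10 TB of live data this works out to about 15 TB of billed backup space. Directories with large files that change often can exceed the 1.5 factor.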

Notes

Backups regularly report errors for files that were not backed up because they changed while the backup process was reading them. This typically happens for one of two reasons:

  • The file was deliberately modified, either by the user or by a running job. If the file is a permanent file, it will be backed up the next time the backup process runs, within one day.
  • The file is a temporary file being used by a job. Skipping such a file is actually a good thing: by the next backup run it will no longer exist, so it will never be backed up. Some temporary files are backed up from time to time, only to be marked as deleted the next time the backup runs. These files remain in the backup system for 90 days, the length of time it takes for a file to expire after being deleted.