Difference between revisions of "Globus"

From UFRC

Latest revision as of 14:14, 31 January 2024

Globus is an easy-to-use, high-performance data transfer tool developed by the Computation Institute, the University of Chicago and Argonne National Laboratory. UF Research Computing has deployed Globus as one mechanism to facilitate data transfer to and from HiPerGator.

Globus uses a GridFTP network that draws on multiple servers to transfer data in parallel.

UFRC maintains a managed Globus setup with multiple servers for the highest available bandwidth and filesystem throughput. The HPG managed endpoint is 'UFRC HiPerGator'.

Getting Started

Note: At this time, federated users can upload data to HPG only if a UF-based user creates a writable guest collection for them.

Getting a Globus Account

Globus will redirect you to UF GatorLink authentication when you log in to Globus.org (https://globus.org). If you created your globus.org account before Globus adopted the new authentication, you'll need to link (https://docs.globus.org/how-to/link-to-existing/) your old @globusid.org account to your @ufl.edu account.

Sharing Data: Quick Start

If you just want to share a directory on one of the HPG filesystems, here is a very brief procedure:

  • Go to https://globus.org, choose 'University of Florida' as the authentication provider, and log in with your GatorLink credentials.
  • Type or paste 'UFRC HiPerGator' into the collection search box.
  • Type or paste either the top of the directory tree you want to work in, e.g. '/blue/mygroup' or '/orange/mygroup', or the full path to the directory you want to share. You may have to grant the 'Globus Application' permission to access files in your browser at this point.
  • Browse into the directory you want to share if you didn't paste the full path above.
  • Click on the 'Share' menu item.
  • Give the share a name and click on 'Create Collection'. A 'Shared Collection' is now set up, but nobody except you has access yet. If you have trouble creating a collection, use this link instead: Create New Collection (https://app.globus.org/file-manager/collections/5dbaf795-8a7e-4dca-91aa-6e10d610c2b3/shares?filter=%5B%5D).

Note: Everyone you want to grant access to must log in to Globus at least once, so that their Globus account is created and they can tell you their Globus identity (username).

  • To give Globus users permission to read or write to that directory, click on the 'Permissions' > 'Add Permissions - Share With' icon and search for, type, or paste a Globus username.
  • Choose Read, Write, or both for the permission as appropriate.
  • Optionally, add a message that will be sent to the target user by email.
  • Click on the 'Add Permission' button.

Do not change the path or switch away from 'user - share with specific individuals'. You can share with a group only if a suitable 'Globus' group already exists (we have to set one up if you really need multiple users in a single group). Otherwise, share with individual Globus users. They are not necessarily HiPerGator users, so don't confuse Globus users/groups with HPG users/groups.

That's all!
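
The permission steps above can also be scripted with the Globus CLI. The following is only a minimal sketch, not a UFRC-provided tool. It assumes the CLI is available and you are already logged in, that the guest collection already exists, and that you know its ID. GUEST_COLLECTION_ID and collaborator@example.edu are hypothetical placeholders. The sketch uses `globus endpoint permission create`, which takes `r` or `rw` as the access level, and by default it only prints the command so you can review it first.

```shell
#!/bin/sh
# Sketch: grant one Globus identity access to an existing guest collection.
# The ID below is a hypothetical placeholder; replace it with your collection's ID.
GUEST_COLLECTION_ID="${GUEST_COLLECTION_ID:-00000000-0000-0000-0000-000000000000}"

# Print the command when DRY_RUN is 1 (the default); execute it otherwise.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# grant PATH PERMS IDENTITY
#   PATH     directory inside the guest collection ("/" is the collection root)
#   PERMS    r (read) or rw (read and write)
#   IDENTITY Globus username of the person you are sharing with
grant() {
    grant_path="$1"
    # Access rules are set on directories, so make sure the path ends in "/".
    case "$grant_path" in
        */) ;;
        *) grant_path="$grant_path/" ;;
    esac
    run globus endpoint permission create "${GUEST_COLLECTION_ID}:${grant_path}" \
        --permissions "$2" --identity "$3"
}

# Example: read-only access to the whole guest collection (hypothetical user).
grant / r collaborator@example.edu
```

Run it once as shown to inspect the printed command, then rerun with DRY_RUN=0 to apply the permission.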

Transferring Data

For extended instructions on transferring data with Globus, see Transfer Data with Globus.

Introduction

A Globus transfer happens between two endpoints: the source and the target of the transfer. From the point of view of a user transferring data, the endpoints are generally called 'Collections'. There are two levels of collections. The first level defines a particular Globus setup and is associated with the computing resource Globus is installed on, such as a server, cluster, storage system, desktop, or laptop. For example, the HiPerGator Globus setup has a 'Managed Collection' called 'UFRC HiPerGator' that gives you access to the HPG filesystems you are authorized to use. An install of Globus Connect Personal (GCP) on your laptop or desktop, on the other hand, requires setting up a 'Personal Collection' and giving it a name that can be searched for in the Globus interface on https://globus.org. Since a personal system is only for your use, the 'Personal Collection' is all you set up in a GCP install. The second level consists of 'Guest Collections': under a 'Managed Collection' like 'UFRC HiPerGator', every user can set up many 'Guest Collection' instances on their filesystem directories. The guest collections become endpoints for data transfers, whether for you or for other Globus users you give read or write access to.

UF Research Computing Globus Setup

We manage a multi-server, high-throughput Globus install on HiPerGator under the umbrella of a managed Globus collection named 'UFRC HiPerGator'. It allows you to transfer data right away or to set up any number of guest collections on your directories in the /home, /blue, and /orange filesystems. The 'UFRC HiPerGator' collection can be found by searching within the Globus File Explorer interface on https://globus.org.

A valid HiPerGator user account is required to access the 'UFRC HiPerGator' collection directly. However, any Globus user you have authorized to access a guest collection created under 'UFRC HiPerGator' can transfer data to or from that guest collection without an HPG account, because they are identified by the Globus ID to which you granted permission.

If you do not have a Research Computing account, you may request one here (http://www.rc.ufl.edu/help/account-request/). Please note that the username and password used to activate the 'UFRC HiPerGator' collection are your GatorLink credentials. At this time, federated users cannot directly access the 'UFRC HiPerGator' collection. If you are having problems with your password, please submit a support request through the UF Computing Help Desk (http://helpdesk.ufl.edu/) or visit https://account.it.ufl.edu/ for self-service, e.g. to reset your GatorLink password.
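
For readers who prefer the command line, the same lookup can be sketched with the Globus CLI. This is an illustrative example under stated assumptions, not an official UFRC procedure: it assumes the CLI is available and you are logged in. HPG_COLLECTION_ID is a hypothetical placeholder for the ID that the search reports, and '/blue/mygroup/' is the example group path used earlier on this page. By default the script only prints the commands.

```shell
#!/bin/sh
# Sketch: find the managed collection by name, then list a directory on it.
# Placeholder ID: substitute the ID that the search command reports.
HPG_COLLECTION_ID="${HPG_COLLECTION_ID:-00000000-0000-0000-0000-000000000000}"

# Print the command when DRY_RUN is 1 (the default); execute it otherwise.
# The printed form drops shell quoting, so treat it as a preview only.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Search for the collection by its display name.
run globus endpoint search "UFRC HiPerGator"

# List a directory you are authorized to access (example path).
run globus ls "${HPG_COLLECTION_ID}:/blue/mygroup/"
```

Set DRY_RUN=0 to run the commands for real once the placeholder ID has been replaced.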

Globus Connect Personal

Globus provides users with Globus Connect Personal (https://www.globus.org/globus-connect-personal) software they can use to set up a personal collection on their computing device (Windows, macOS, Linux) for the purpose of performing data transfers from or to that device.

You will be asked to name a personal Globus collection while installing GCP. Once created, that collection can be found by name when searching within the Globus File Explorer interface on https://globus.org. Once running, GCP lets you configure which directories on your personal device it can access and what kind of access it has. Initially, only you can transfer data to or from a personal computing device running GCP. However, if you have 'Globus Plus' user status, which all University of Florida members already have because UFRC has a Globus license, you can give other Globus users access to the personal collection on your device by creating Shared Collections, the same way you would with a guest collection on HPG. Be very careful to avoid mistakes when giving other people access to files on your personal computing device this way. See Client Plus (https://globus.stanford.edu/client/plus.html) for a nice write-up on this functionality.

Once GCP is running and configured, you will use the Globus File Explorer interface at https://globus.org to make transfers using the personal collection presented by the GCP instance on your device. Think of GCP as a personal 'Globus Server' and https://globus.org as the interface for making transfers.

Automation

You can automate Globus transfers to avoid having to run GCP personally. See https://docs.globus.org/how-to/automate-with-service-account/ for details. Let us know if you run into any issues.
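
As a hedged illustration of what an unattended job might check before it starts, the sketch below verifies that service-account credentials are present in the environment. It assumes, to the best of our understanding of the Globus CLI, that the CLI reads client credentials from the GLOBUS_CLI_CLIENT_ID and GLOBUS_CLI_CLIENT_SECRET environment variables; confirm this against the documentation linked above before relying on it. The helper name require_env is hypothetical.

```shell
#!/bin/sh
# Sketch: preflight check for an unattended transfer job.
# Assumption: the Globus CLI picks up service-account credentials from
# GLOBUS_CLI_CLIENT_ID and GLOBUS_CLI_CLIENT_SECRET (verify in the Globus docs).

# Return 0 only if every named environment variable is set and non-empty.
require_env() {
    for name in "$@"; do
        # Indirect lookup of the variable whose name is stored in $name.
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "missing required variable: $name" >&2
            return 1
        fi
    done
    return 0
}

if require_env GLOBUS_CLI_CLIENT_ID GLOBUS_CLI_CLIENT_SECRET 2>/dev/null; then
    echo "service-account credentials found"
else
    echo "service-account credentials not set; an interactive 'globus login' is needed"
fi
```

Keeping the check separate from the transfer itself lets a scheduled job fail early with a clear message instead of stalling at a login prompt.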

UFRC Globus Group

Note: this step is not needed if you are only creating and using guest collections under the managed 'UFRC HiPerGator' collection. It is only needed for creating shared collections on your personal computing device running GCP.

You need Globus Plus user status to create shared endpoints on your local computer running GCP if the transfer will be made to a non-managed endpoint (e.g. another local computer with a personal endpoint) and you are not a University of Florida member. To obtain that status, log in to the Globus Interface (https://www.globus.org), click on the Groups menu at the top, and select Search For Groups. Search for 'University of Florida Research Computing' and request access to the group. Once approved, you will have Globus Plus user status when running the 'Globus Connect Personal' software on your local computer.

Direct link to the groups form: https://app.globus.org/groups

Globus CLI

If you need to initiate Globus transfers from the command line, e.g. as part of a workflow, you may want to use the Globus Command-Line Interface (CLI) (https://docs.globus.org/cli/). Before you can initiate transfers with the Globus CLI on a given computer, you have to log in to Globus using the globus login command. If you're in an X11 environment, a web browser will be started for you to perform authentication. This happens, for example, if you are in a terminal in an OnDemand console or desktop session and run

$ module load globus
$ globus login

If you're in a 'headless' environment where GUI programs cannot run, use the --no-local-server option.

Example:

$ globus login --no-local-server
Please log into Globus here:
---------------------------
https://auth.globus.org/v2/oauth2/authorize?code_challenge=0dlql_6KJbguIncy3-NH
0w8oBOXABa7DqvZ2O2K4bGs&state=_default&redirect_uri=https%3A%2F%2Fauth.globus.o
rg%2Fv2%2Fweb%2Fauth-code&prefill_named_grant=login3.stampede2.tacc.utexas.edu&
response_type=code&client_id=95fdeba8-fac2-42bd-a357-e068d82ff78e&scope=openid+
profile+email+urn%3Aglobus%3Aauth%3Ascope%3Atransfer.api.globus.org%3Aall&code_
challenge_method=S256&access_type=offline
---------------------------

Enter the resulting Authorization Code here:

Copy the URL from the shell and visit that web page in a web browser on any computer. If you are not currently logged in to Globus on that computer, you will be asked to log in at this time. You may log in with your Globus ID or an ID from any organization, such as University of Florida.

You will be taken to a page where an authorization code is displayed. Copy the code to your computer's clipboard.

Paste the authorization code into the shell where you ran the "globus login --no-local-server" command, as a response to its prompt.

Enter the resulting Authorization Code here: 1zIOeE6leaZpnk0Fxfmi1J8UauARmF

You have successfully logged in to the Globus CLI as johnqpublic@globusid.org

You can always check your current identity with
  globus whoami

Logout of the Globus CLI with
  globus logout

You will now be able to issue Globus CLI commands on this computer. You can check your Globus login status at any time with the command globus whoami.
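
Once logged in, a transfer can be submitted and monitored from a script. The following is a minimal sketch under stated assumptions rather than a recommended UFRC workflow: SRC_ID, DST_ID, both paths, and the label are hypothetical placeholders, and the helper name transfer_and_wait is ours. It uses `globus transfer` with `--jmespath 'task_id' --format unix` to capture the task ID and `globus task wait` to block until the task finishes. By default it only prints the commands it would run.

```shell
#!/bin/sh
# Sketch: submit a recursive directory transfer and wait for it to finish.
# Both IDs are hypothetical placeholders; replace them with real collection IDs.
SRC_ID="${SRC_ID:-11111111-1111-1111-1111-111111111111}"   # e.g. your personal collection
DST_ID="${DST_ID:-22222222-2222-2222-2222-222222222222}"   # e.g. a collection on HPG

# transfer_and_wait SOURCE DESTINATION LABEL
#   SOURCE and DESTINATION use the COLLECTION_ID:/path/ form.
transfer_and_wait() {
    src="$1"; dst="$2"; label="$3"
    if [ "${DRY_RUN:-1}" = "1" ]; then
        # Preview mode: show what would be run without contacting Globus.
        echo "globus transfer $src $dst --recursive --label $label"
        echo "globus task wait <task-id>"
        return 0
    fi
    # Submit the transfer and keep only the task ID from the response.
    task_id="$(globus transfer "$src" "$dst" --recursive --label "$label" \
        --jmespath 'task_id' --format unix)" || return 1
    # Block until the task completes (or fails).
    globus task wait "$task_id"
}

# Example call with placeholder paths.
transfer_and_wait "${SRC_ID}:/~/data/" "${DST_ID}:/blue/mygroup/data/" hpg-upload
```

Run it with DRY_RUN=0 after replacing the placeholders; the exit status of the script then reflects whether the wait succeeded.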