Difference between revisions of "Transfer Data"

 
Latest revision as of 23:13, 29 March 2024

Overview

This document describes transferring data between a local computer (client) and HiPerGator (HPG). For file sharing on the cluster, see Sharing Files.

Tools

A variety of command-line, GUI, and web-based tools are available for transferring data to or from HiPerGator. Command-line tools include cp, mv, scp, rsync, sftp, wget, curl, and ncftp (from the ncftp environment module). GUI tools such as Cyberduck, WinSCP, and BitVise SFTP can be used on your local computer. The Globus data transfer tool is available via a web interface, in addition to command-line and GUI versions that can be run on your local computer. Please visit the HPG how-to video series on Data Transfer for more details.

Transferring Data within HiPerGator

Login servers on HiPerGator (hpg.rc.ufl.edu) can be used to copy or move files with rsync, cp, or mv, subject to the group configuration and permissions or filesystem ACLs (extended permissions). There is a shared directory at /blue/GROUP/share for groups that prefer to share their data between group members. See also: Shared Work and Storage Management, Sharing Within A Cluster.
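As a minimal sketch of a copy within the cluster, the snippet below copies a directory tree with rsync and falls back to cp when rsync is not installed. The temporary directories and the helper name copy_tree are stand-ins chosen for illustration; on HiPerGator the destination would typically be a group location such as /blue/GROUP/share, where GROUP is a placeholder for your group name.

```shell
# Stand-in directories so the sketch runs anywhere. On HiPerGator, DEST_DIR
# would be a group location such as /blue/GROUP/share (GROUP is a placeholder).
SRC_DIR="$(mktemp -d)"
DEST_DIR="$(mktemp -d)"
echo "sample data" > "$SRC_DIR/example.txt"

# copy_tree is a hypothetical helper, not a HiPerGator command.
copy_tree() {
  # Prefer rsync: on a rerun it skips files that are already up to date.
  if command -v rsync >/dev/null 2>&1; then
    rsync -a "$1"/ "$2"/
  else
    # cp -a preserves permissions, timestamps, and symbolic links.
    cp -a "$1"/. "$2"/
  fi
}

copy_tree "$SRC_DIR" "$DEST_DIR"
ls -l "$DEST_DIR"
```

Whether the copy succeeds on a real group directory still depends on the permissions and ACLs mentioned above.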

Transferring Data Outside of HiPerGator

Between a local computer and HiPerGator

Open OnDemand

Open OnDemand has file management tools for transferring files (less than 20 GB at a time). See Open OnDemand for details.

Globus

If your data files are large (hundreds of megabytes or gigabytes), try Globus first.

Samba

Samba is a Linux server that provides remote filesystem access using the SMB/CIFS protocol. See the Samba Main Wiki Page for more info.

SFTP/Rsync

For smaller file sizes, or if Globus is not an option, use SFTP, rsync, or scp by connecting to 'hpg.rc.ufl.edu' or 'sftp.rc.ufl.edu'. Set the port to 22 if you have to specify it. Use your GatorLink credentials to connect. Make sure MFA (multi-factor authentication) is taken into account to avoid having to go through MFA for every file transferred. You can also use terminal interfaces like BitVise, which include an embedded GUI SFTP function.
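The sketch below shows what typical scp, rsync, and sftp invocations against the transfer host might look like. It only prints the commands, so it is safe to run anywhere. The user name gatorlink_user, the local path results, the helper name transfer_cmd, and the choice of /blue/GROUP/share as the remote directory are placeholders for illustration; substitute your own values.

```shell
HPG_USER="gatorlink_user"        # placeholder: your GatorLink user name
HPG_HOST="sftp.rc.ufl.edu"       # host named in the text above
HPG_PORT=22                      # port named in the text above
REMOTE_DIR="/blue/GROUP/share"   # placeholder destination on HiPerGator

# transfer_cmd is a hypothetical helper that prints the command for a tool.
#   $1 = tool (scp, rsync, or sftp), $2 = local path to upload
transfer_cmd() {
  case "$1" in
    scp)   # scp uses an uppercase -P for the port; -r copies directories
      printf 'scp -P %s -r %s %s@%s:%s\n' \
        "$HPG_PORT" "$2" "$HPG_USER" "$HPG_HOST" "$REMOTE_DIR" ;;
    rsync) # rsync passes the port through its ssh transport
      printf 'rsync -av -e "ssh -p %s" %s %s@%s:%s\n' \
        "$HPG_PORT" "$2" "$HPG_USER" "$HPG_HOST" "$REMOTE_DIR" ;;
    sftp)  # sftp opens an interactive session; use put/get inside it
      printf 'sftp -P %s %s@%s\n' "$HPG_PORT" "$HPG_USER" "$HPG_HOST" ;;
    *)
      echo "unknown tool: $1" >&2
      return 1 ;;
  esac
}

transfer_cmd scp results
transfer_cmd rsync results
transfer_cmd sftp
```

Copying a whole directory in one scp or rsync invocation, rather than one invocation per file, is one way to keep the number of authentication prompts down.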

Note
If you are using Cyberduck on a macOS computer and notice that it is using an old password, resulting in a security lockout, uncheck 'Use Keychain' in 'Preferences > General'.

Rclone

Rclone is a command-line data transfer tool that supports transferring data to or from dozens of cloud storage providers and remote locations. Please see https://rclone.org/docs/ for information on a specific storage provider.
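A minimal sketch of an Rclone download follows. It assumes a remote named mydrive has already been set up with the interactive `rclone config` command; the remote name, the source path project/data, and the destination under /blue/GROUP/share are all hypothetical. If rclone or the remote is not available, the script only prints the command it would run.

```shell
REMOTE="mydrive"                  # hypothetical remote created with `rclone config`
REMOTE_PATH="project/data"        # hypothetical folder on the remote
DEST="/blue/GROUP/share/data"     # placeholder destination on HiPerGator
SOURCE="${REMOTE}:${REMOTE_PATH}"

# `rclone listremotes` prints one configured remote per line, e.g. "mydrive:".
if command -v rclone >/dev/null 2>&1 &&
   rclone listremotes | grep -qx "${REMOTE}:"; then
  # copy transfers new or changed files and never deletes at the destination.
  rclone copy --progress "$SOURCE" "$DEST"
else
  echo "Would run: rclone copy --progress $SOURCE $DEST"
fi
```

Running `rclone lsd mydrive:` first is a quick way to confirm that the remote is reachable before starting a large transfer.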

JupyterHub

The easiest way to upload data using JupyterHub is the Upload Files button in the File Browser (Ctrl+Shift+F) menu.

Jupyterhub upload.png

From HiPerGator to a remote system/site

If you are logged into HiPerGator and need to transfer data to or from a remote system or site, use the login nodes. Transfers can also be made from within development sessions. You can use ftp (lftp command), sftp, scp, rsync, or Globus to transfer data out.

Downloading Data from an External Server or Third-Party Website

Terminal: wget or curl

Copy Link Address.jpg

On most file-hosting websites, you can right-click the "Download" button and select an option similar to "Copy link address".

Once the link address is copied to the clipboard, paste it into your HPG terminal after the "wget" or "curl" command, for example:
$ wget https://github.com/author/software/main.zip
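For comparison, the sketch below wraps both tools in one small helper that uses whichever is installed. The function name fetch is hypothetical, and the URL in the usage comment is the placeholder from the example above rather than a real download.

```shell
# fetch is a hypothetical helper: download a URL into the current directory.
fetch() {
  if [ -z "$1" ]; then
    echo "usage: fetch URL" >&2
    return 2
  fi
  if command -v wget >/dev/null 2>&1; then
    # wget saves the file under its remote name by default.
    wget "$1"
  elif command -v curl >/dev/null 2>&1; then
    # -L follows redirects, -O keeps the remote file name,
    # --fail avoids saving an HTML error page as if it were the file.
    curl --fail -L -O "$1"
  else
    echo "neither wget nor curl is available" >&2
    return 1
  fi
}

# Usage (placeholder URL):
#   cd /blue/GROUP/share && fetch https://github.com/author/software/main.zip
```

Links that require a browser login usually cannot be fetched this way.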

Internet Browser from HPG Host

Another option is to use the Firefox or Chrome browsers via Open OnDemand or an X11-based SSH session.

Start an X11 forwarding-enabled SSH session or a terminal in a HiPerGator Desktop session on Open OnDemand:

1. From an HPG node, load the ubuntu module, which contains a version of the Google Chrome browser.

   - $ module load ubuntu
   - $ chrome (or firefox)
   The Internet browser should launch.

2. Navigate to the third-party service, such as Dropbox or Google Drive, and log in using your GatorLink account.

3. Edit the browser's download settings to change the Download location (path) to your preferred /blue directory.

4. Navigate to the target directory in your Drive account using the browser and select files or folders to download.

  The files should download directly to your selected /blue path.

Download data to personal computer first

Alternatively, users can download files to their local computers and upload them to HPG using one of the methods described in the "Between a local computer and HiPerGator" section above.

Providing access to your data

If you need to share data with another RC user or publicly for collaboration, see the instructions at Providing Access To Data.