Jupyter Notebooks: Difference between revisions

From UFRC
m Jupyter via Open OnDemand: : Added note about 72 hour limit when launched from OOD.
 
(29 intermediate revisions by 3 users not shown)
Line 1: Line 1:
[[Category:Software]][[Category:Python]][[Category:Machine Learning]]
[[Category:Software]][[Category:Python]][[Category:Machine Learning]] __NOTOC__
__TOC__
==Available Options for Starting and Connecting to a Jupyter Notebook Server==
 
=Available Options for Starting and Connecting to a Jupyter Notebook Server=
[[File:Jupyter.png|250px|right]]
[[File:Jupyter.png|250px|right]]
[https://jupyter.org/ Jupyter] Notebooks are a popular web-based environment for teaching, testing, developing and running code. Notebooks allow seamless integration of live code, richly formatted text, images, visualizations, cleanly formatted equations and more. Jupyter supports [https://github.com/jupyter/jupyter/wiki/Jupyter-kernels many programming languages], but is most often associated with Python.
[https://jupyter.org/ Jupyter] Notebooks are a popular web-based environment for teaching, testing, developing and running code. Notebooks allow seamless integration of live code, richly formatted text, images, visualizations, cleanly formatted equations and more. Jupyter supports [https://github.com/jupyter/jupyter/wiki/Jupyter-kernels many programming languages], but is most often associated with Python.


UF Research Computing offers several methods to run Jupyter. This page provides general information about Jupyter, Jupyter Notebooks and Jupyter Lab. For details on starting Jupyter on HiPerGator, please see the sections below for each option. ''In general, these options are listed in the order of ease of use.''
UF Research Computing offers several methods to run Jupyter. This page provides general information about Jupyter, Jupyter Notebooks and Jupyter Lab. For details on starting Jupyter on HiPerGator, please see the sections below for each option. ''In general, these options are listed in the order of ease of use.''
{{Note|JupyterHub can only use resources from your primary group's allocation. If you need yo use resources from a secondary group, please use [[Jupyter OOD|Jupyter via Open on Demand]].|warn}}
{{Note|JupyterHub can only use resources from your primary group's allocation. If you need to use resources from a secondary group, please use [[Jupyter#Jupyter via Open OnDemand|Jupyter via Open OnDemand]].|warn}}
{{Note|Remember that leaving idle Jupyter servers running wastes resources and prevents you and others from using those resources. See below for information on how to stop your server when you are done working.|reminder}}
{{Note|Remember that leaving idle Jupyter servers running wastes resources and prevents you and others from using those resources. See below for information on how to stop your server when you are done working.|reminder}}
==JupyterHub==
===JupyterHub===
 


<div class="mw-collapsible mw-collapsed" style="width: 80%; padding: 5px; border: 1px solid gray;">
''Expand for JupyterHub: an easy, one-click option to start a Jupyter Lab server with resources selected from a simple dropdown menu. All JupyterHub jobs run using your primary group's resources.''
<div class="mw-collapsible-content" style="padding: 5px;">
If you are looking for a convenient way to run JupyterLab notebooks try '''[https://jhub.rc.ufl.edu UFRC JupyterHub]''' service. It presents a convenient web interface to start notebooks, consoles, or terminals with multiple custom kernels and several job resource request profiles, which we can expand on request to satisfy your needs.  
If you are looking for a convenient way to run JupyterLab notebooks try '''[https://jhub.rc.ufl.edu UFRC JupyterHub]''' service. It presents a convenient web interface to start notebooks, consoles, or terminals with multiple custom kernels and several job resource request profiles, which we can expand on request to satisfy your needs.  


Line 30: Line 30:
* After clicking Start, your job is submitted to the SLURM scheduler requesting the resources indicated. It may take a few minutes or longer for your job to start depending on cluster load and your group's available resources.
* After clicking Start, your job is submitted to the SLURM scheduler requesting the resources indicated. It may take a few minutes or longer for your job to start depending on cluster load and your group's available resources.


===Stopping Your Server When Finished===
====Stopping Your Server When Finished====
 
[[File:JHub hubcontrol.png|200px|frameless|right]]
[[File:JHub hubcontrol.png|150px|frameless|right]]
If you are done with your work, rather than leaving the server running with idle resources tied up with your job, please stop your server.
If you are done with your work, rather than leaving the server running with idle resources tied up with your job, please stop your server.
* From the File menu, select '''Hub Control Panel'''
* From the File menu, select '''Hub Control Panel'''
* As noted above, there is a caching bug in JupyterHub that may result in you having a different user's username in the top right of the page. If that is the case, do a hard refresh of the page.  
* As noted above, there is a caching bug in JupyterHub that may result in you having a different user's username in the top right of the page. If that is the case, do a hard refresh of the page. [[File:JHub StopServer.png |350px]]
[[File:JHub StopServer.png|frameless|center]]
* After verifying your username, click the Stop My Server button.
* After verifying your username, click the Stop My Server button.


Line 46: Line 44:
* Launch a terminal
* Launch a terminal
* Upload and download files via your browser
* Upload and download files via your browser
</div>
</div>


==Jupyter via Open OnDemand==
===Jupyter via Open OnDemand===
<div class="mw-collapsible mw-collapsed" style="width:80%; padding: 5px; border: 1px solid gray;">
''Expand for Jupyter via OOD: an easy method to start a Jupyter Lab server that offers additional configuration options beyond the dropdown available in JupyterHub. Jupyter via OOD allows user-configurable resource requests, group selection and other options.  Please note, notebooks running in the GPU partition are limited to 72 hours of runtime.''
<div class="mw-collapsible-content" style="padding: 5px;">
[[Open OnDemand| Open OnDemand]] is a service that provides web-based access to HiPerGator, including JupyterLab and Jupyter Notebooks.
[[Open OnDemand| Open OnDemand]] is a service that provides web-based access to HiPerGator, including JupyterLab and Jupyter Notebooks.


Line 73: Line 76:
* Once your job starts the Connect button will appear. Click that to connect to your Jupyter session.
* Once your job starts the Connect button will appear. Click that to connect to your Jupyter session.


===Reconnecting to Running Sessions===
====Reconnecting to Running Sessions====
[[File:OOD MyInteractiveSessions.png | right]]
You can close your browser window and reconnect to existing sessions using the My Interactive Sessions menu (sometimes shown with just the icon on smaller screens).
You can close your browser window and reconnect to existing sessions using the My Interactive Sessions menu (sometimes shown with just the icon on smaller screens).
[[File:OOD MyInteractiveSessions.png|center]]


===Deleting Running Sessions===
====Deleting Running Sessions====
Your Jupyter session will run for the time you selected, consuming the allocated resources. If you are finished with your analyses, you can release those resources by deleting the job. Using the My Interactive Sessions menu shown above, find the session and click the Delete button. The Delete button stops the SLURM job. The Notebooks/files/etc. created in your job are not deleted.
Your Jupyter session will run for the time you selected, consuming the allocated resources. If you are finished with your analyses, you can release those resources by deleting the job. Using the My Interactive Sessions menu shown above, find the session and click the Delete button. The Delete button stops the SLURM job. The Notebooks/files/etc. created in your job are not deleted.
[[File:OOD Running.png|400px|center]]


===Using the Older Jupyter Notebook Interface===
====Using the Older Jupyter Notebook Interface====  
[[File:OOD Jupyter SimpleNotebooks.png|frameless|right]]By default, the checkbox shown at the right is selected, starting your server with the more modern JupyterLab interface. If you want to use the older, simpler Jupyter Notebook interface, uncheck the box.
By default, the checkbox shown at the right is selected, starting your server with the more modern JupyterLab interface. If you want to use the older, simpler Jupyter Notebook interface, uncheck the box.
[[File:OOD Jupyter SimpleNotebooks.png|frameless]]
</div>
</div>


==Standalone Jupyter Notebook==
===Standalone Jupyter Notebook===
<div class="mw-collapsible mw-collapsed" style="width:80%; padding: 5px; border: 1px solid gray;">
''Expand for Standalone Notebooks: methods to submit a batch job and connect via SSH tunnels.''
<div class="mw-collapsible-content" style="padding: 5px;">
This is a ''manual'' mechanism to start a Jupyter notebook within a SLURM job on HiPerGator and connect to it from the web browser running on your local computer.
This is a ''manual'' mechanism to start a Jupyter notebook within a SLURM job on HiPerGator and connect to it from the web browser running on your local computer.


===Interactive Session===
====Interactive Session====
If you're in a [[Development_and_Testing|dev SLURM session]] then  
If you're in a [[Development_and_Testing|dev SLURM session]] then  
* Note the host name, which you'll need to create an SSH tunnel to your notebook.
* Note the host name, which you'll need to create an SSH tunnel to your notebook.
Line 94: Line 102:
* Create an SSH tunnel from your local computer to the notebook using SSH forwarding (see below).
* Create an SSH tunnel from your local computer to the notebook using SSH forwarding (see below).


===SLURM Job===
====SLURM Job====
If you would like a notebook to live for longer than the 12-hour time limit for dev sessions start it inside a SLURM job.
If you would like a notebook to live for longer than the 12-hour time limit for dev sessions start it inside a SLURM job.


Line 119: Line 127:
;Note: The jupyter environment includes all R and python packages/modules we installed on request.
;Note: The jupyter environment includes all R and python packages/modules we installed on request.


====Connection Information====
=====Connection Information=====


Once the job starts look at the jupyter_notebook_$SLURM_JOBID.log SLURM output file to learn the hostname and the port jupyter notebook was started on. The ssh tunnel and local URI paths should already be there.
Once the job starts look at the jupyter_notebook_$SLURM_JOBID.log SLURM output file to learn the hostname and the port jupyter notebook was started on. The ssh tunnel and local URI paths should already be there.
Line 154: Line 162:
Copy the token to use it as the password the first time you connect to the notebook. In this example the token is <code>06b1c3f73bb847234c198a22bd62b7f20101b04d1bc2b64a</code>.
Copy the token to use it as the password the first time you connect to the notebook. In this example the token is <code>06b1c3f73bb847234c198a22bd62b7f20101b04d1bc2b64a</code>.


==Create Tunnel From Local Machine==
====Create Tunnel From Local Machine====
Copy paste the tunnel command from the job script or write your own based on how you manually started a notebook.
Copy paste the tunnel command from the job script or write your own based on how you manually started a notebook.


ssh -NL 23312:c10b-s14.ufhpc:23312 jdoe@hpg.rc.ufl.edu
ssh -NL 23312:c10b-s14.ufhpc:23312 jdoe@hpg.rc.ufl.edu


==Browse To Notebook==
====Browse To Notebook====
In a web browser on the local machine open [http://localhost:23312 http://localhost:23312]
In a web browser on the local machine open [http://localhost:23312 http://localhost:23312]


Line 167: Line 175:


Again, note that the default Jupyter Notebook setup you see should have at least four kernels - two default kernels (python and R) that come with Jupyter and two additional kernels that provide access to environments provided by RC-specific environment modules e.g. 'RC R-3.5.1' and 'RC Py3-3.6.5', which match the same environment modules you use in batch jobs.
Again, note that the default Jupyter Notebook setup you see should have at least four kernels - two default kernels (python and R) that come with Jupyter and two additional kernels that provide access to environments provided by RC-specific environment modules e.g. 'RC R-3.5.1' and 'RC Py3-3.6.5', which match the same environment modules you use in batch jobs.
</div>
</div>


=Accessing Blue and Orange Directories=
==Accessing Blue and Orange Directories==
<div class="mw-collapsible mw-collapsed" style="width: 80%; padding: 5px; border: 1px solid gray;">
''Instructions for accessing directories outside of your home.''
<div class="mw-collapsible-content" style="padding: 5px;">
At first, Jupyter only has access to your home directory (<code>/home/<i>gatorlink</i></code>). In order to access directories outside of your home, it is necessary to add links to those directories using the command line. These links are similar to aliases or shortcuts on your computer. Common directories to add are your groups' <code>/blue</code> and <code>/orange</code> directories.
At first, Jupyter only has access to your home directory (<code>/home/<i>gatorlink</i></code>). In order to access directories outside of your home, it is necessary to add links to those directories using the command line. These links are similar to aliases or shortcuts on your computer. Common directories to add are your groups' <code>/blue</code> and <code>/orange</code> directories.


Line 189: Line 202:


Then, you'll see 'blue_gator-group' or 'orange_gator-group' as folders in your home directory in JupyterLab and will be able to double-click on those to browse the directories.
Then, you'll see 'blue_gator-group' or 'orange_gator-group' as folders in your home directory in JupyterLab and will be able to double-click on those to browse the directories.
</div>
</div>


=Exporting Notebooks as Executable Scripts=
==Exporting Notebooks as Executable Scripts==
Notebooks are a great method for testing and development, but can be cumbersome when it comes to production runs. It is simple to export a Jupyter Notebook as an executable script (.py file for example).
Notebooks are a great method for testing and development, but can be cumbersome when it comes to production runs. It is simple to export a Jupyter Notebook as an executable script (.py file for example).
* Select File > Export Notebook As... > Export Notebook to Executable Script.
* Select File > Export Notebook As... > Export Notebook to Executable Script.
Line 196: Line 211:
{{Note|You can also export notebooks as PDFs, HTML and other formats.|reminder}}
{{Note|You can also export notebooks as PDFs, HTML and other formats.|reminder}}


=Available Kernels=
==Jupyter Kernels==
'''Note:''' We happily add python or R packages/modules to available environments/kernels. Use our [https://support.rc.ufl.edu support system] to request package addition. All kernels are based on environment modules that can also be loaded in an interactive terminal session or in job scripts with 'module load MODULE'. Users can also install their own personal packages with methods described in the [[R]] FAQ section.
===UFRC Managed Kernels===
* 'Python3 VERSION (basic)' - the default python3 kernel from the jupyterhub install, which doesn't have much in it. You likely don't want to use it unless you pip install all modules you want to use yourself.
 
* 'Python3 VERSION (full)' - our main HiPerGator 'python' modules with "Everything and the kitchen sink" in them as far as python modules are concerned. 'Full' python kernels represents our major generic python environments for general-purpose work and development. Since packages can be installed or updated and their api can change do not rely on this environment for long-term project work if code stability is paramount.  
{{Note|The built-in python kernel named 'Python3 (ipykernel)' is empty, but cannot be removed. Please don't use it unless you plan to install all packages on your own.|reminder}}
* 'PyViz-0.10.0' - (pyviz module) Environment for [https://pyviz.org/ https://pyviz.org/] based data analysis and plotting environment.
 
* 'R VERSION (full)' - (R modules) Main R environment module(s) from HiPerGator with everything and a kitchen sink as far as packages are concerned.
We will happily add python or R packages/modules to available environments/kernels. Use the [https://support.rc.ufl.edu RC Support System] to request package installs. All RC managed Jupyter kernels are based on environment modules that can also be loaded in an interactive terminal session or in job scripts with 'module load'. Users can also install their own personal packages with methods described in the [[R]] FAQ section.
* Pytorch-VERSION and Tensorflow-VERSION kernels represent current stable versions of PyTorch and TensorFlow and all the package installed within those environments on request. We're happy to install more packages in those environments, but these are inherently more fragile than general python environments because ML/DL field is moving so fast and rife with incompatible packages. When you module load pytorch or tensorflow or use one of the kernels they will detect whether they are running in a CPU-only or a GPU environment end load the correct environment.
 
We provide custom kernels named '<code>RC-py3-$version</code>' and '<code>RC-R-$version</code>' that provide access to hundreds of R packages and python3 modules we installed to support exploratory research and code writing by UF researchers on request. Use [https://support.rc.ufl.edu https://support.rc.ufl.edu] to request additional package and module installs. Note that the shared python3 and R environments can only have one package/module version to avoid conflicts. Use [https://virtualenv.pypa.io python virtualenv] or conda environments to have custom module installs for particular projects as shown below.


All other kernels are application-specific. Their installation requests are documented in our support system and on this help site.
All other kernels are application-specific. Their installation requests are documented in our support system and on this help site.


{{Note|For directions on setting up your own Julia kernel, please see the [[Julia]] page.|reminder}}
For directions on setting up your own '''Julia''' kernel, please see the [[Julia]] page.


==Personal Kernels==
===Personal Kernels===


{{Note|For a more thorough treatment of this topic, please see the [[Managing Python environments and Jupyter kernels]] page.|reminder}}  
{{Note|For a more thorough treatment of this topic, please see the [[Managing Python environments and Jupyter kernels]] page.|reminder}}  

Latest revision as of 18:58, 23 March 2023

Available Options for Starting and Connecting to a Jupyter Notebook Server

Jupyter Notebooks are a popular web-based environment for teaching, testing, developing and running code. Notebooks allow seamless integration of live code, richly formatted text, images, visualizations, cleanly formatted equations and more. Jupyter supports many programming languages, but is most often associated with Python.

UF Research Computing offers several methods to run Jupyter. This page provides general information about Jupyter, Jupyter Notebooks and Jupyter Lab. For details on starting Jupyter on HiPerGator, please see the sections below for each option. In general, these options are listed in the order of ease of use.

JupyterHub can only use resources from your primary group's allocation. If you need to use resources from a secondary group, please use Jupyter via Open OnDemand.
Remember that leaving idle Jupyter servers running wastes resources and prevents you and others from using those resources. See below for information on how to stop your server when you are done working.

JupyterHub

Expand for JupyterHub: an easy, one-click option to start a Jupyter Lab server with resources selected from a simple dropdown menu. All JupyterHub jobs run using your primary group's resources.

Jupyter via Open OnDemand

Expand for Jupyter via OOD: an easy method to start a Jupyter Lab server that offers additional configuration options beyond the dropdown available in JupyterHub. Jupyter via OOD allows user-configurable resource requests, group selection and other options. Please note, notebooks running in the GPU partition are limited to 72 hours of runtime.

Standalone Jupyter Notebook

Expand for Standalone Notebooks: methods to submit a batch job and connect via SSH tunnels.
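
For the SSH tunnel mentioned above, the command can be assembled from the compute node hostname and port reported in the job log. The following is only a sketch: the node, port and username are the example values used on this page (c10b-s14.ufhpc, 23312, jdoe), and the variable names are made up for illustration. Substitute the values from your own job.

```shell
#!/bin/bash
# Example values from this page; replace them with those from your own job log.
NODE="c10b-s14.ufhpc"        # compute node running the notebook
PORT=23312                   # port the notebook is listening on
USER_NAME="jdoe"             # placeholder username
LOGIN_HOST="hpg.rc.ufl.edu"  # login host used for the tunnel

# -N: do not run a remote command
# -L: forward local PORT to NODE:PORT through the login host
TUNNEL_CMD="ssh -NL ${PORT}:${NODE}:${PORT} ${USER_NAME}@${LOGIN_HOST}"

echo "Run on your local machine: ${TUNNEL_CMD}"
echo "Then open: http://localhost:${PORT}"
```

The script only prints the command; run the printed ssh command on your local computer, not on HiPerGator.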

Accessing Blue and Orange Directories

Instructions for accessing directories outside of your home.
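
As an illustration of adding such links from the command line, the sketch below creates symbolic links named blue_gator-group and orange_gator-group. It assumes that group directories live at /blue/GROUP and /orange/GROUP and uses 'gator-group' as a placeholder group name; adjust both to match your own group. The LINK_DIR variable is introduced here only for illustration.

```shell
#!/bin/bash
GROUP="gator-group"            # placeholder group name; use your own
LINK_DIR="${LINK_DIR:-$HOME}"  # where the links are created (home directory by default)

for fs in blue orange; do
    link="${LINK_DIR}/${fs}_${GROUP}"
    # Assumed layout: /blue/<group> and /orange/<group>.
    # Do nothing if a file or link with that name already exists.
    if [ ! -e "$link" ] && [ ! -L "$link" ]; then
        ln -s "/${fs}/${GROUP}" "$link"
    fi
done

# Show the links (ls -ld lists the link itself, not its target's contents).
ls -ld "${LINK_DIR}/blue_${GROUP}" "${LINK_DIR}/orange_${GROUP}"
```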

Exporting Notebooks as Executable Scripts

Notebooks are a great method for testing and development, but can be cumbersome when it comes to production runs. It is simple to export a Jupyter Notebook as an executable script (.py file for example).

  • Select File > Export Notebook As... > Export Notebook to Executable Script.
You can also export notebooks as PDFs, HTML and other formats.
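
The same export can usually be done from a terminal with the nbconvert tool that ships with Jupyter. The sketch below is an assumption-laden example rather than a UFRC-specific recipe: it writes a tiny stand-in notebook (demo_notebook.ipynb, a hypothetical name) into a scratch directory and converts it to a script only if nbconvert is available in the current environment.

```shell
#!/bin/bash
# Work in a scratch directory so nothing in your home directory is touched.
WORK_DIR="$(mktemp -d)"
cd "$WORK_DIR" || exit 1
NB="demo_notebook.ipynb"   # hypothetical notebook name

# Write a minimal one-cell notebook as a stand-in for your own notebook.
cat > "$NB" <<'EOF'
{
 "cells": [
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": ["print('hello from a notebook')"]
  }
 ],
 "metadata": {
  "language_info": {"name": "python", "file_extension": ".py"}
 },
 "nbformat": 4,
 "nbformat_minor": 4
}
EOF

# Convert only if nbconvert is available in the current environment.
if jupyter nbconvert --version >/dev/null 2>&1; then
    jupyter nbconvert --to script "$NB"
    ls -l "$WORK_DIR"
else
    echo "jupyter nbconvert not found; load a Jupyter-enabled environment first."
fi
```

For a real notebook, the only line that matters is the jupyter nbconvert --to script call with your notebook's filename.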

Jupyter Kernels

UFRC Managed Kernels

The built-in python kernel named 'Python3 (ipykernel)' is empty, but cannot be removed. Please don't use it unless you plan to install all packages on your own.

We will happily add python or R packages/modules to available environments/kernels. Use the RC Support System to request package installs. All RC managed Jupyter kernels are based on environment modules that can also be loaded in an interactive terminal session or in job scripts with 'module load'. Users can also install their own personal packages with methods described in the R FAQ section.

We provide custom kernels named 'RC-py3-$version' and 'RC-R-$version' that provide access to hundreds of R packages and python3 modules we installed to support exploratory research and code writing by UF researchers on request. Use https://support.rc.ufl.edu to request additional package and module installs. Note that the shared python3 and R environments can only have one package/module version to avoid conflicts. Use python virtualenv or conda environments to have custom module installs for particular projects as shown below.

All other kernels are application-specific. Their installation requests are documented in our support system and on this help site.

For directions on setting up your own Julia kernel, please see the Julia page.

Personal Kernels

For a more thorough treatment of this topic, please see the Managing Python environments and Jupyter kernels page.

Users can define their own Jupyter kernels for use in JupyterHub. See https://jupyter-client.readthedocs.io/en/stable/kernels.html

In short, kernel definitions can be put into the ~/.local/share/jupyter/kernels directory. See /apps/jupyterhub/kernels/ for examples of how we define commonly used kernels. You can also copy a template kernel from /apps/jupyterhub/template_kernel. Replace the placeholder paths and strings in the template files run.sh and kernel.json in accordance with your conda environment configuration.

Note: Even though kernel.json defines the display_name, the folder name must also be unique. It is not enough to copy a folder and update the contents of the kernel.json and run.sh files; you must also rename the folder.
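
To make the layout concrete, here is a minimal sketch of a personal kernel definition in the generic format described in the Jupyter kernel documentation linked above. It does not reproduce the UFRC template (which also uses a run.sh wrapper). The kernel folder name 'my_env', the display name and the python path are placeholders, and the environment is assumed to have the ipykernel package installed.

```shell
#!/bin/bash
KERNELS_DIR="${KERNELS_DIR:-$HOME/.local/share/jupyter/kernels}"
KERNEL_NAME="my_env"   # hypothetical folder name; must be unique among your kernels
ENV_PYTHON="/path/to/conda/envs/my_env/bin/python"   # placeholder path to the env's python

mkdir -p "${KERNELS_DIR}/${KERNEL_NAME}"

# Generic kernelspec: Jupyter replaces {connection_file} when it starts the kernel.
# Assumes ipykernel is installed in the environment that ENV_PYTHON belongs to.
cat > "${KERNELS_DIR}/${KERNEL_NAME}/kernel.json" <<EOF
{
 "argv": ["${ENV_PYTHON}", "-m", "ipykernel_launcher", "-f", "{connection_file}"],
 "display_name": "My Env (personal)",
 "language": "python"
}
EOF

cat "${KERNELS_DIR}/${KERNEL_NAME}/kernel.json"
```

After a kernel folder like this exists, the display name should appear in the launcher the next time the Jupyter server starts.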

To troubleshoot issues with personal kernels, check the log files at ~/ondemand/data/sys/dashboard/batch_connect/sys/jupyter/output/YOUR_SESSION_ID/output.log
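
If you do not know the session ID, one option is to look at the most recently modified output.log under that directory. The helper below is a sketch: latest_log and OUTPUT_BASE are names made up for this example, and it assumes each session keeps its log in its own subdirectory as in the path above.

```shell
#!/bin/bash
# Base directory from this page; each session has its own subdirectory.
OUTPUT_BASE="${OUTPUT_BASE:-$HOME/ondemand/data/sys/dashboard/batch_connect/sys/jupyter/output}"

# Print the path of the most recently modified output.log under the given directory.
latest_log() {
    ls -t "$1"/*/output.log 2>/dev/null | head -n 1
}

LOG="$(latest_log "$OUTPUT_BASE")"
if [ -n "$LOG" ]; then
    echo "Showing the last 50 lines of $LOG"
    tail -n 50 "$LOG"
else
    echo "No output.log found under $OUTPUT_BASE"
fi
```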