Jupyter Notebooks

From UFRC
Latest revision as of 16:20, 20 July 2022

Jupyter Notebooks are only accessible from within the UF network. Use the VPN (https://vpn.ufl.edu) if off campus.

Available Options for Starting and Connecting to a Jupyter Notebook Server


Jupyter (https://jupyter.org/) Notebooks are a popular web-based environment for teaching, testing, developing and running code. Notebooks allow seamless integration of live code, richly formatted text, images, visualizations, cleanly formatted equations and more. Jupyter supports many programming languages (https://github.com/jupyter/jupyter/wiki/Jupyter-kernels), but is most often associated with Python.

UF Research Computing offers several methods to run Jupyter. This page provides general information about Jupyter, Jupyter Notebooks and Jupyter Lab. For details on starting Jupyter on HiPerGator, please see the page for each option listed below. In general, these options are listed in order of ease of use.

  • JupyterHub: An easy, one-click option to start a Jupyter Lab server with resources selected from a simple dropdown menu. All JupyterHub jobs run using your primary group's resources.
  • Jupyter via Open on Demand: An easy method to start a Jupyter Lab server that offers additional configuration options beyond the dropdown available in JupyterHub. Jupyter via OOD allows user-configurable resource requests, group selection and other options.
  • Batch job submission: This page details methods to submit a batch job and connect via SSH tunnels.

Accessing Blue and Orange Directories

By default, Jupyter only has access to your home directory (/home/gatorlink). To access directories outside of your home, you need to add links to those directories using the command line. These links are similar to aliases or shortcuts on your computer. Common directories to add are your groups' /blue and /orange directories.

Open a Terminal

Jupyter Launcher Terminal.png

We will need a terminal to run the commands below. You can use an SSH client, use the OOD Shell Access, or launch a Terminal within your Jupyter server. The image on the right shows the Terminal Launcher button at the bottom of the Launcher panel in JupyterLab. If needed, you can open the Launcher with the '+' icon in the top left or by selecting New Launcher from the File menu.

Create the Link

The specific type of link we want to create is referred to as a symbolic link or symlink. The format of the command used to create this link is ln -s path_to_link_to name_of_link.

  • In general, we recommend making a link to your group's directory. This allows you to use the group's share folder and more easily collaborate with others in the group than if you made the link to your own folder within the group directory.
  • Since people are often in multiple groups, we recommend naming the link with the convention blue_group (for example, blue_gator-group). This allows for a separate link to each group's directory.
  • The id command will show you the groups you are a member of:
[agator@login4 ~]$ id
uid=12345(agator) gid=12345(gator-group) groups=12345(gator-group),12346(orange-group),12347(blue-group)
[agator@login4 ~]$ 
  • The commands below change to your home directory (cd) and create links to the fictional gator-group /blue and /orange directories.
[agator@login4 ~]$ cd
[agator@login4 ~]$ ln -s /blue/gator-group blue_gator-group
[agator@login4 ~]$ ln -s /orange/gator-group orange_gator-group

You will then see 'blue_gator-group' and 'orange_gator-group' as folders in your home directory in JupyterLab and can double-click them to browse the directories.
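If you belong to several groups, the same steps can be repeated in a loop. The following is a minimal sketch rather than an official UFRC tool: the function name link_group_dirs and its optional arguments are hypothetical, and it assumes each group's directories live at /blue/GROUP and /orange/GROUP as in the example above.

```shell
# Sketch: create blue_GROUP and orange_GROUP links for each of your groups.
# link_group_dirs is a hypothetical helper; the optional arguments exist only
# so the function can be tried against a scratch directory first.
link_group_dirs() {
    local link_dir="${1:-$HOME}"     # where the links are created
    local root="${2:-}"              # prefix in front of /blue and /orange (normally empty)
    local groups="${3:-$(id -Gn)}"   # space-separated group names
    local group tier target link
    for group in $groups; do
        for tier in blue orange; do
            target="$root/$tier/$group"
            link="$link_dir/${tier}_${group}"
            # Skip groups that have no directory on this storage tier.
            [ -d "$target" ] || continue
            # Leave any existing file, folder or link untouched.
            if [ -e "$link" ] || [ -L "$link" ]; then
                continue
            fi
            ln -s "$target" "$link" && echo "linked $link -> $target"
        done
    done
}
```

Calling link_group_dirs with no arguments uses your home directory and the groups reported by id. Note that the link name is written without a trailing slash, since ln cannot create a link whose name ends in one.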

Exporting Notebooks as Executable Scripts

Notebooks are a great tool for testing and development, but can be cumbersome for production runs. It is simple to export a Jupyter Notebook as an executable script (a .py file, for example).

  • Select File > Export Notebook As... > Export Notebook to Executable Script.
You can also export notebooks as PDFs, HTML and other formats.
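The same export can usually be done from a terminal with nbconvert, which is convenient inside job scripts. This is a sketch under the assumption that the jupyter command and nbconvert are available in the environment you have loaded; the wrapper name export_notebook is hypothetical.

```shell
# Sketch: export a notebook from the command line.
# export_notebook is a hypothetical wrapper around "jupyter nbconvert";
# it assumes jupyter and nbconvert are available in the current environment.
export_notebook() {
    local notebook="$1"
    local format="${2:-script}"   # script (default), html, pdf, ...
    if [ ! -f "$notebook" ]; then
        echo "no such notebook: $notebook" >&2
        return 1
    fi
    jupyter nbconvert --to "$format" "$notebook"
}
```

For a notebook that uses a Python kernel, export_notebook analysis.ipynb writes analysis.py next to the notebook, and export_notebook analysis.ipynb html writes an HTML copy instead.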

Available Kernels

Note: We happily add Python or R packages/modules to available environments/kernels. Use our support system (https://support.rc.ufl.edu) to request a package addition. All kernels are based on environment modules that can also be loaded in an interactive terminal session or in job scripts with 'module load MODULE'.

  • 'Python3 VERSION (basic)' - the default Python 3 kernel from the JupyterHub install, which contains very few packages. You likely don't want to use it unless you pip install every module you need yourself.
  • 'Python3 VERSION (full)' - our main HiPerGator 'python' modules with "everything and the kitchen sink" in them as far as Python modules are concerned. 'Full' Python kernels represent our major generic Python environments for general-purpose work and development. Since packages can be installed or updated and their APIs can change, do not rely on this environment for long-term project work if code stability is paramount.
  • 'PyViz-0.10.0' - (pyviz module) Environment for data analysis and plotting based on PyViz (https://pyviz.org/).
  • 'R VERSION (full)' - (R modules) Main R environment module(s) from HiPerGator with everything and the kitchen sink as far as packages are concerned.
  • Pytorch-VERSION and Tensorflow-VERSION kernels represent current stable versions of PyTorch and TensorFlow, plus all the packages installed within those environments on request. We're happy to install more packages in those environments, but they are inherently more fragile than general Python environments because the ML/DL field is moving so fast and is rife with incompatible packages. When you module load pytorch or tensorflow, or use one of the kernels, they detect whether they are running in a CPU-only or a GPU environment and load the correct environment.

All other kernels are application-specific. Their installation requests are documented in our support system and on this help site.

For directions on setting up your own Julia kernel, please see the Julia page.

Personal Kernels

For a more thorough treatment of this topic, please see the Managing Python environments and Jupyter kernels page.

Users can define their own Jupyter kernels for use in JupyterHub. See https://jupyter-client.readthedocs.io/en/stable/kernels.html

In short, kernel definitions can be put into the ~/.local/share/jupyter/kernels directory. See /apps/jupyterhub/kernels/ for examples of how we define commonly used kernels. You can also copy a template kernel from /apps/jupyterhub/template_kernel. Replace the placeholder paths and strings in the template files run.sh and kernel.json in accordance with your conda environment configuration.

Note: Even though kernel.json defines the display_name, the folder name must also be unique. You cannot just copy a folder and update the contents of the kernel.json and run.sh files; you also need to rename the folder.
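The copy-and-rename step can be sketched as a small shell function. This is an illustration only: new_personal_kernel is a hypothetical name, the default paths are the ones mentioned above, and the placeholders in run.sh and kernel.json still have to be edited by hand afterwards.

```shell
# Sketch: copy the template kernel into a new, uniquely named folder.
# new_personal_kernel is a hypothetical helper; the second and third
# arguments override the default paths and exist mainly for testing.
new_personal_kernel() {
    local name="$1"
    local template="${2:-/apps/jupyterhub/template_kernel}"
    local kernels_dir="${3:-$HOME/.local/share/jupyter/kernels}"
    local dest="$kernels_dir/$name"
    if [ -z "$name" ]; then
        echo "usage: new_personal_kernel NAME [TEMPLATE] [KERNELS_DIR]" >&2
        return 1
    fi
    # The folder name must be unique, independent of display_name.
    if [ -e "$dest" ]; then
        echo "kernel folder already exists: $dest" >&2
        return 1
    fi
    mkdir -p "$kernels_dir" && cp -r "$template" "$dest" || return 1
    echo "created $dest; now edit run.sh and kernel.json"
}
```

Refusing to overwrite an existing folder avoids ending up with two kernels that share a folder name.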

To troubleshoot issues with personal kernels, check the log files at ~/ondemand/data/sys/dashboard/batch_connect/sys/jupyter/output/YOUR_SESSION_ID/output.log
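Because each session writes to its own folder, it can help to locate the most recent log automatically. The sketch below assumes the directory layout given above and that session folder names contain no spaces; latest_kernel_log is a hypothetical helper name.

```shell
# Sketch: print the path of the most recently modified session log.
# latest_kernel_log is a hypothetical helper; it assumes session folder
# names contain no spaces or newlines.
latest_kernel_log() {
    local base="${1:-$HOME/ondemand/data/sys/dashboard/batch_connect/sys/jupyter/output}"
    local log
    # ls -t sorts by modification time, newest first.
    log="$(ls -t "$base"/*/output.log 2>/dev/null | head -n 1)" || true
    if [ -z "$log" ]; then
        echo "no output.log found under $base" >&2
        return 1
    fi
    echo "$log"
}
```

For example, tail -n 50 "$(latest_kernel_log)" shows the last 50 lines of the newest session's log.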