Ollama

Description

ollama website

Get up and running with large language models.

Environment Modules

Run module spider ollama to find out what environment modules are available for this application.

System Variables

HPC_OLLAMA_DIR - installation directory

Additional Information

Interactive Ollama use

Start an interactive HiperGator Desktop session on a GPU node through Open OnDemand (https://ood.rc.ufl.edu/) and open two terminals: one to run the Ollama server and one to chat with LLMs.

In terminal 1, load the ollama module and start the server with either the default or custom environment settings:

1. Default settings (default environment variables)
   $ ml ollama
   $ ollama serve

2. Custom settings (environment variables passed to the server)
   $ ml ollama
   $ env {options} ollama serve

   For example, to store models in a custom directory, bind the server to 127.0.0.1:11435, keep models in memory for 1 hour, and spread each model across all assigned GPUs:
   
   $ env OLLAMA_MODELS=/blue/group/$USER/ollama/models OLLAMA_HOST=127.0.0.1:11435 OLLAMA_KEEP_ALIVE=60m OLLAMA_SCHED_SPREAD=T ollama serve
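
When the server listens on a non-default address, clients on the same node have to target that address as well. The sketch below is a hypothetical helper (the name ollama_base_url is not part of Ollama) that turns an OLLAMA_HOST-style value into an HTTP base URL, falling back to the default 127.0.0.1:11434 when nothing is set. It assumes a plain host or host:port value with no URL scheme and no IPv6 address.

```python
import os

DEFAULT_HOST = "127.0.0.1"
DEFAULT_PORT = "11434"  # Ollama's default port


def ollama_base_url(host=None):
    """Build an HTTP base URL from an OLLAMA_HOST-style 'host[:port]' value.

    Hypothetical helper for illustration; handles only scheme-less IPv4/hostname values.
    """
    if host is None:
        host = os.environ.get("OLLAMA_HOST", "")
    host = host.strip()
    if not host:
        return f"http://{DEFAULT_HOST}:{DEFAULT_PORT}"
    # Split once on ':' so a missing host or port falls back to the default
    name, sep, port = host.partition(":")
    return f"http://{name or DEFAULT_HOST}:{port if sep and port else DEFAULT_PORT}"


print(ollama_base_url("127.0.0.1:11435"))
```

The returned string can be passed as the base_url argument of an HTTP or LangChain client so that it reaches the server started with the custom settings.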

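In terminal 2, pull a model and start chatting. If the server was started with a custom OLLAMA_HOST, set the same value in this terminal first (for example, export OLLAMA_HOST=127.0.0.1:11435) so that the client connects to the right address. For example, with llama3.2: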

  $ ml ollama
  $ ollama pull llama3.2
  $ ollama run llama3.2
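
To confirm from a script that the pull succeeded, the server's REST API can be queried for the locally available models. This is a minimal sketch using only the Python standard library; it assumes the server is reachable at the default address, and list_models and model_names are illustrative helper names rather than part of any Ollama package.

```python
import json
import urllib.request


def model_names(payload):
    """Extract model names from a parsed /api/tags response."""
    return [m["name"] for m in payload.get("models", [])]


def list_models(base_url="http://127.0.0.1:11434"):
    """Ask a running Ollama server which models are available locally."""
    with urllib.request.urlopen(f"{base_url}/api/tags", timeout=10) as resp:
        return model_names(json.load(resp))


# Illustrative payload shaped like the server's response (not real output)
sample = {"models": [{"name": "llama3.2:latest"}]}
print(model_names(sample))
```

With the server from terminal 1 running, calling list_models() should include the model that was just pulled.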

Ollama as a Slurm job

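#!/bin/bash
#SBATCH --job-name=ollama
#SBATCH --output=ollama_%j.log
#SBATCH --ntasks=1
#SBATCH --mem=8gb
#SBATCH --partition=gpu
#SBATCH --gpus=a100:1
#SBATCH --time=01:00:00
date; hostname; pwd

module load ollama

# Add a conda environment that provides langchain to PATH (placeholder path)
env_path=/my/conda/env/bin
export PATH=$env_path:$PATH

# Start the server in the background and wait until it answers requests
ollama serve &
server_pid=$!
until ollama list > /dev/null 2>&1; do sleep 1; done

# The model used by the script must already be pulled (ollama pull llama3.2)
python my_ollama_python_script.py >> my_ollama_output.txt

# Stop the background server once the script has finished
kill $server_pid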
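
Because the server is launched in the background, the Python script can also check that it is reachable before sending prompts, which gives a clear failure instead of a connection error partway through the job. The sketch below is one way to do that with the standard library; wait_for_server is an illustrative name, and the timeout and polling interval are arbitrary assumptions to adjust as needed.

```python
import time
import urllib.error
import urllib.request


def wait_for_server(base_url="http://localhost:11434", timeout=60.0, interval=1.0):
    """Poll the server root URL until it answers with HTTP 200 or the timeout expires.

    Returns True if the server responded in time, False otherwise.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(base_url, timeout=5) as resp:
                if resp.status == 200:
                    return True
        except (urllib.error.URLError, OSError):
            pass  # server is not accepting connections yet
        time.sleep(interval)
    return False
```

A script could call wait_for_server() at the top and exit with an error message if it returns False.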

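Example Python script (my_ollama_python_script.py):

# Requires the langchain-community package; older LangChain releases exposed this class as langchain.llms.Ollama
from langchain_community.llms import Ollama
ollama = Ollama(base_url="http://localhost:11434", model="llama3.2")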
print(ollama.invoke("why is the sky blue"))
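
If LangChain is not available in the environment, the same prompt can be sent directly to the server's REST API. This is a minimal standard-library sketch rather than a replacement for the script above; build_payload and generate are illustrative names, and it assumes the server is on the default port with llama3.2 already pulled.

```python
import json
import urllib.request


def build_payload(model, prompt):
    """Request body for /api/generate; stream=False asks for a single JSON reply."""
    return {"model": model, "prompt": prompt, "stream": False}


def generate(prompt, model="llama3.2", base_url="http://localhost:11434"):
    """Send a prompt to a running Ollama server and return the generated text."""
    data = json.dumps(build_payload(model, prompt)).encode("utf-8")
    request = urllib.request.Request(
        f"{base_url}/api/generate",
        data=data,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as resp:
        return json.load(resp)["response"]
```

With the server running, print(generate("why is the sky blue")) plays the same role as the LangChain call.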