Difference between revisions of "FAQ"
Line 1: | Line 1: | ||
[[Category:Help]] | [[Category:Help]] | ||
− | + | {|align=right | |
− | =Storage= | + | |__TOC__ |
− | ==Navigating Blue and Orange Storage== | + | |} |
+ | ==Storage== | ||
+ | ===Navigating Blue and Orange Storage=== | ||
If you are listing /blue or /orange you won't see your group's directory tree. It's automatically connected (mounted) when you try to access it in any way e.g. by using an 'ls' or 'cd' command. E.g. if your group name is 'mygroup' you should list or cd into /blue/mygroup or /orange/mygroup. See also this short video: https://web.microsoftstream.com/video/87698fe6-84df-40dc-9d22-c3a6c63820fa | If you are listing /blue or /orange you won't see your group's directory tree. It's automatically connected (mounted) when you try to access it in any way e.g. by using an 'ls' or 'cd' command. E.g. if your group name is 'mygroup' you should list or cd into /blue/mygroup or /orange/mygroup. See also this short video: https://web.microsoftstream.com/video/87698fe6-84df-40dc-9d22-c3a6c63820fa | ||
− | ==No Space Left== | + | ===No Space Left=== |
If you see a 'No Space Left' or a similar message (no quota remaining, etc) check the path(s) in the error message closely to look for 'home', 'orange', 'blue', or 'red' and check the respective quota for that filesystem. All quota commands are in the [[UFRC_environment_module|'ufrc' environment module]] and include 'home_quota', 'blue_quota', 'orange_quota'. See [[Getting Started]] and [[Storage]] for more help. | If you see a 'No Space Left' or a similar message (no quota remaining, etc) check the path(s) in the error message closely to look for 'home', 'orange', 'blue', or 'red' and check the respective quota for that filesystem. All quota commands are in the [[UFRC_environment_module|'ufrc' environment module]] and include 'home_quota', 'blue_quota', 'orange_quota'. See [[Getting Started]] and [[Storage]] for more help. | ||
Line 11: | Line 13: | ||
In case you consider purchasing more storage, please visit [https://rc.ufl.edu/get-started/purchase-allocation/ the Purchase Allocation portal]. | In case you consider purchasing more storage, please visit [https://rc.ufl.edu/get-started/purchase-allocation/ the Purchase Allocation portal]. | ||
− | =Applications= | + | ==Applications== |
− | ==Custom Installation== | + | ===Custom Installation=== |
'''Q''': I want to have a custom install of an application or python modules. | '''Q''': I want to have a custom install of an application or python modules. | ||
Line 19: | Line 21: | ||
See also: [[Installing Personal Python Modules]] and [[Managing Python environments and Jupyter kernels]] | See also: [[Installing Personal Python Modules]] and [[Managing Python environments and Jupyter kernels]] | ||
− | ==Python== | + | ===Python=== |
'''Q''': Installed a python package via 'pip install something', but 'import something' results in an error. | '''Q''': Installed a python package via 'pip install something', but 'import something' results in an error. | ||
'''A''': A pip install you performed puts the resulting package into your personal directory tree located in the '''~/.local/lib/pythonX.Y/site-packages''' directory tree. A personal pip install can often result in an installation of a python package from a binary archive (wheel) that was built on a system against software libraries that are not compatible with HiPerGator. A typical error message in such case complains about the lack of a particular GLIBC version or some other missing library. Note that the issue can be exacerbated by an incompatible interaction between an environment loaded via an environment module ('module load something') and a personal python package install. To avoid this issue the python package must be installed into an isolated environment. Our approach for creating such environments depends on many factors, but usually results in a Conda or containerized environment. | '''A''': A pip install you performed puts the resulting package into your personal directory tree located in the '''~/.local/lib/pythonX.Y/site-packages''' directory tree. A personal pip install can often result in an installation of a python package from a binary archive (wheel) that was built on a system against software libraries that are not compatible with HiPerGator. A typical error message in such case complains about the lack of a particular GLIBC version or some other missing library. Note that the issue can be exacerbated by an incompatible interaction between an environment loaded via an environment module ('module load something') and a personal python package install. To avoid this issue the python package must be installed into an isolated environment. Our approach for creating such environments depends on many factors, but usually results in a Conda or containerized environment. | ||
− | ==R== | + | ===R=== |
'''Q:''' When I submit a job using 'parallel' package all threads seem to share a single CPU core instead of running on the separate cores I requested. | '''Q:''' When I submit a job using 'parallel' package all threads seem to share a single CPU core instead of running on the separate cores I requested. | ||
Line 60: | Line 62: | ||
$ R CMD INSTALL /path/to/package_version.tar.gz</pre> | $ R CMD INSTALL /path/to/package_version.tar.gz</pre> | ||
− | =Performance= | + | ==Performance== |
'''Q''': Why is HiPerGator running so slow? | '''Q''': Why is HiPerGator running so slow? | ||
Line 76: | Line 78: | ||
In case you consider purchasing more resources, please visit [https://rc.ufl.edu/get-started/purchase-allocation/ the Purchase Allocation portal]. | In case you consider purchasing more resources, please visit [https://rc.ufl.edu/get-started/purchase-allocation/ the Purchase Allocation portal]. | ||
− | =Why is my job still pending?= | + | ==Why is my job still pending?== |
According to SLURM documentation, when a job cannot be started a reason is immediately found and recorded in the job's "reason" field in the squeue output and the scheduler moves on to the next job to consider. | According to SLURM documentation, when a job cannot be started a reason is immediately found and recorded in the job's "reason" field in the squeue output and the scheduler moves on to the next job to consider. | ||
Line 82: | Line 84: | ||
Related article: [https://help.rc.ufl.edu/doc/Account_and_QOS_limits_under_SLURM Account and QOS limits under SLURM] | Related article: [https://help.rc.ufl.edu/doc/Account_and_QOS_limits_under_SLURM Account and QOS limits under SLURM] | ||
− | ==Common reasons why jobs are pending== | + | ===Common reasons why jobs are pending=== |
− | ;Priority: Resources being reserved for higher priority job. This is particularly common on Burst QOS jobs. Refer to the [https://help.rc.ufl.edu/doc/Account_and_QOS_limits_under_SLURM#Choosing_QOS_for_a_Job Choosing QOS for a Job] page for details. | + | {| |
+ | |- | ||
+ | | | ||
+ | ;Priority: Resources being reserved for higher priority job. This is particularly common on Burst QOS jobs. | ||
+ | * Refer to the [https://help.rc.ufl.edu/doc/Account_and_QOS_limits_under_SLURM#Choosing_QOS_for_a_Job Choosing QOS for a Job] page for details. | ||
;Resources: Required resources are in use | ;Resources: Required resources are in use | ||
;Dependency: Job dependencies not yet satisfied | ;Dependency: Job dependencies not yet satisfied | ||
;Reservation: Waiting for advanced reservation | ;Reservation: Waiting for advanced reservation | ||
;AssociationJobLimit: User or account job limit reached | ;AssociationJobLimit: User or account job limit reached | ||
+ | || | ||
;AssociationResourceLimit: User or account resource limit reached | ;AssociationResourceLimit: User or account resource limit reached | ||
;AssociationTimeLimit: User or account time limit reached | ;AssociationTimeLimit: User or account time limit reached | ||
Line 94: | Line 101: | ||
;QOSResourceLimit: Quality Of Service (QOS) resource limit reached | ;QOSResourceLimit: Quality Of Service (QOS) resource limit reached | ||
;QOSTimeLimit: Quality Of Service (QOS) time limit reached | ;QOSTimeLimit: Quality Of Service (QOS) time limit reached | ||
+ | |} |
Revision as of 18:44, 9 November 2022
Storage
Navigating Blue and Orange Storage
Listing /blue or /orange directly will not show your group's directory tree. The directory is connected (mounted) automatically the first time you access it by its full path, for example with an 'ls' or 'cd' command. If your group name is 'mygroup', list or cd into /blue/mygroup or /orange/mygroup. See also this short video: https://web.microsoftstream.com/video/87698fe6-84df-40dc-9d22-c3a6c63820fa
No Space Left
If you see a 'No Space Left' or similar message (no quota remaining, etc.), look closely at the path(s) in the error message for 'home', 'orange', 'blue', or 'red', then check the quota for that filesystem. The quota commands are in the 'ufrc' environment module and include 'home_quota', 'blue_quota', and 'orange_quota'. See Getting Started and Storage for more help.
A convenient interactive tool for seeing what is using up your storage quota is 'ncdu', available in the 'ufrc' environment module.
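The path check described above can be sketched as a small shell helper. This is only an illustration: the function name 'quota_cmd_for_path' is hypothetical, and the assumption that the filesystems appear as the path prefixes /home, /blue, and /orange comes from the examples in this FAQ rather than from any official tool.

```shell
# Hypothetical helper: map a path from an error message to the quota
# command named in this FAQ. The path prefixes are assumptions.
quota_cmd_for_path() {
    case "$1" in
        /home/*)   echo "home_quota" ;;
        /blue/*)   echo "blue_quota" ;;
        /orange/*) echo "orange_quota" ;;
        *)
            # This FAQ lists no quota command for other filesystems.
            echo "no quota command listed in this FAQ for: $1" >&2
            return 1
            ;;
    esac
}
```

For example, 'quota_cmd_for_path /blue/mygroup/results.csv' prints 'blue_quota', which you would then run after loading the 'ufrc' environment module.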
If you are considering purchasing more storage, please visit the Purchase Allocation portal.
Applications
Custom Installation
Q: I want to have a custom install of an application or python modules.
A: We recommend creating a Conda environment and installing the needed packages with the 'mamba' tool from the conda environment module. Conda-installed and pip-installed packages can be mixed inside a Conda environment, because conda/mamba is aware of packages installed via pip, but not vice versa.
See also: Installing Personal Python Modules and Managing Python environments and Jupyter kernels
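As a rough sketch of the recommendation above, the helper below wraps the two steps: loading the conda environment module and creating an environment with 'mamba'. The function name 'create_conda_env', the environment name 'myenv', and the package names are placeholders chosen for illustration; the linked pages describe the supported procedure.

```shell
# Sketch only: the function name and its arguments are hypothetical.
# Usage: create_conda_env ENV_NAME PACKAGE_SPEC...
create_conda_env() {
    env_name="$1"
    shift
    # Environment module named in the answer above.
    module load conda
    # Create the environment and install the requested packages into it.
    mamba create --yes --name "$env_name" "$@"
}

# Example call (to be run on HiPerGator, not executed here):
#   create_conda_env myenv python=3.10 numpy pandas
#   conda activate myenv
#   pip install some-package   # pip installs into the active environment
```

Installing with pip only after the environment is activated keeps the pip packages inside that environment instead of your personal ~/.local tree.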
Python
Q: Installed a python package via 'pip install something', but 'import something' results in an error.
A: A personal pip install puts the resulting package into the ~/.local/lib/pythonX.Y/site-packages directory tree in your home directory. Such an install often pulls in a binary archive (wheel) that was built against software libraries that are not compatible with HiPerGator. A typical error message in that case complains about a missing GLIBC version or some other missing library. The issue can be made worse by an incompatible interaction between an environment loaded via an environment module ('module load something') and a personal python package install. To avoid this issue, install the python package into an isolated environment. Our approach for creating such environments depends on many factors, but usually results in a Conda or containerized environment.
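To see whether personal pip installs exist in your home directory, a sketch like the following lists the per-user site-packages directories mentioned above. The function name 'list_user_site_dirs' is hypothetical, and the optional argument (an alternative home directory) is there only so the sketch can be tried against a test directory.

```shell
# List per-user site-packages directories left by personal pip installs.
# Usage: list_user_site_dirs [HOME_DIR]   (defaults to $HOME)
list_user_site_dirs() {
    home_dir="${1:-$HOME}"
    for site_dir in "$home_dir"/.local/lib/python*/site-packages; do
        # If the glob matches nothing it stays literal and is skipped here.
        if [ -d "$site_dir" ]; then
            printf '%s\n' "$site_dir"
        fi
    done
}
```

If a directory is listed, setting the standard Python environment variable PYTHONNOUSERSITE=1 before starting Python makes the interpreter skip the per-user site-packages directory, which can help confirm whether a personal install is the cause of the import error.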
R
Q: When I submit a job that uses the 'parallel' package, all threads seem to share a single CPU core instead of running on the separate cores I requested.
A: On SLURM you need to use --cpus-per-task to specify the number of available cores. E.g.
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=12
will allow mclapply or another function from the 'parallel' package to run on all requested cores.
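A minimal job script sketch built around the directives above is shown here. It relies on Slurm setting the SLURM_CPUS_PER_TASK environment variable when --cpus-per-task is given; the helper name 'r_core_count' and the script name 'my_analysis.R' are placeholders, not part of any HiPerGator tooling.

```shell
#!/bin/bash
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=12

# Slurm sets SLURM_CPUS_PER_TASK when --cpus-per-task is given;
# fall back to 1 when the variable is unset (e.g. outside a job).
r_core_count() {
    echo "${SLURM_CPUS_PER_TASK:-1}"
}

echo "Cores available to R: $(r_core_count)"

# Hypothetical next step on HiPerGator (script name is a placeholder):
#   module load R
#   Rscript my_analysis.R "$(r_core_count)"
```

Inside the R script, the number passed on the command line can be given to mclapply through its mc.cores argument, so the R code uses the same core count that was requested from the scheduler.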
Q: How do I install R packages?
A: Users can install R packages in their local directory. The default directory is /home/my.username/R/x86_64-pc-linux-gnu-library/X.X/ (X.X = version number)
From a standard repository (such as CRAN-R)
$ module load R/X.X
$ R
> install.packages("PACKAGE")
From github
$ module load R/X.X
$ R
> devtools::install_github("author/software")
or
> remotes::install_github("author/software")
From a tarball
$ module load R/X.X
$ R CMD INSTALL /path/to/package_version.tar.gz
Performance
Q: Why is HiPerGator running so slow?
A: There are many reasons why users may experience unusually low performance while using HPG. First, make sure that the performance issues do not originate from your Internet service provider, home network, or personal devices.
Once those possible causes have been ruled out, report the issue as soon as possible via the RC Support Ticketing System. When reporting the issue, please include detailed information such as:
- Time when the issue occurred
- JobID
- Nodes being used, e.g. as shown in the shell prompt username@hpg-node$. Note: Login nodes are not considered high-performance nodes, and intensive jobs should not be run on them.
- Paths, file names, etc.
- Operating system
- Method for accessing HPG: Jupyterhub, Open OnDemand, or Terminal interface used.
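Some of the details in the list above can be collected on the node where the slowdown is observed. The sketch below is one possible way to do that; the function name 'collect_report_info' is hypothetical, and it relies on Slurm setting SLURM_JOB_ID inside a running job. Your own operating system and access method still need to be added by hand.

```shell
# Sketch: print details worth pasting into a support ticket.
# SLURM_JOB_ID is set by Slurm inside a job; outside one we say so.
collect_report_info() {
    echo "Time: $(date '+%Y-%m-%d %H:%M:%S %Z')"
    echo "JobID: ${SLURM_JOB_ID:-not running inside a job}"
    echo "Node: $(uname -n)"
    echo "Working directory: $(pwd)"
}

collect_report_info
```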
If you are considering purchasing more resources, please visit the Purchase Allocation portal.
Why is my job still pending?
According to SLURM documentation, when a job cannot be started a reason is immediately found and recorded in the job's "reason" field in the squeue output and the scheduler moves on to the next job to consider.
Related article: Account and QOS limits under SLURM
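To read the "reason" field for your own pending jobs, a sketch like the following wraps squeue with a format string that includes it. The function name 'show_pending_reasons' and the column widths are arbitrary choices; %i, %P, %j, and %r are the squeue format fields for job ID, partition, job name, and reason.

```shell
# Sketch: show pending jobs together with the scheduler's reason for each.
# Usage: show_pending_reasons [USERNAME]   (defaults to $USER)
show_pending_reasons() {
    squeue -u "${1:-$USER}" -t PENDING -o '%.18i %.9P %.30j %r'
}
```

Running 'show_pending_reasons' on a login node prints one line per pending job, with the reason in the last column.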
Common reasons why jobs are pending
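Priority: Resources are being reserved for a higher-priority job. This is particularly common for Burst QOS jobs. Refer to the Choosing QOS for a Job page for details.
Resources: Required resources are in use
Dependency: Job dependencies not yet satisfied
Reservation: Waiting for advanced reservation
AssociationJobLimit: User or account job limit reached
AssociationResourceLimit: User or account resource limit reached
AssociationTimeLimit: User or account time limit reached
QOSResourceLimit: Quality Of Service (QOS) resource limit reached
QOSTimeLimit: Quality Of Service (QOS) time limit reached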