Difference between revisions of "Open OnDemand Troubleshooting"

 
Line 9: Line 9:
 
The first thing you should do when encountering an issue with Open OnDemand is to check the logs. The logs can provide useful information about what went wrong and why.
 
  
You can find the logs in the /var/log/ondemand-nginx directory. Look for log files that correspond to the time when you encountered the issue. The logs are typically named access.log and error.log. You can use the tail command to view the last few lines of a log file:
+
You can find the logs in the ~/ondemand/data/sys/dashboard/batch_connect/sys/APPLICATION directory, where APPLICATION is the name of the tool you are running. Inside that directory are subdirectories whose unique names match the Session IDs in your [https://ood.rc.ufl.edu/pun/sys/dashboard/batch_connect/sessions OOD Session List]. Change into the appropriate directory and page through the 'output.log' file. The error is usually near the bottom of the file and tends to be self-explanatory, such as 'out of disk space' or 'Some of your processes may have been killed by the cgroup out-of-memory handler'.
  
tail -f /var/log/ondemand-nginx/access.log
+
=== Full Home Directory ===
tail -f /var/log/ondemand-nginx/error.log
 
  
{| cellpadding="20"
+
If you are unable to access your files or launch jobs, your home directory may be full. Check your disk usage with the command du -sh ~/* in the terminal, or use the 'ncdu' tool from the 'ufrc' environment module. If your home directory is full, remove unnecessary files. Move all data used in jobs to the Blue filesystem to avoid filling up the home directory and violating the [https://www.rc.ufl.edu/documentation/policies/storage/ RC storage policy].
|- style="vertical-align:top;"
 
|
 
=== Full Home Directory ===
 
  
If you are unable to access your files or launch jobs, it may be due to your home directory being full. Check your disk usage with the command du -sh ~/* in the terminal. If you find that your home directory is full, you can try removing unnecessary files or requesting additional disk space.
 
||
 
 
=== Proxy Errors ===
 
  
 
If you are unable to connect to Open OnDemand, you may be experiencing issues with your network proxy settings. Try configuring your proxy settings to allow connections to the ondemand.rc.ufl.edu domain. If you are still unable to connect, please contact us for assistance.
 
|-
 
|
 
  
 
=== Clear Browser Cache ===
 
  
 
If you're encountering issues with the OnDemand web interface, try clearing your browser cache. This can often resolve issues with outdated or corrupted cached files.
 
||
 
=== Restart the OnDemand Services ===
 
 
Sometimes restarting the OnDemand services can resolve issues. You can restart the services by running the following commands:
 
 
systemctl restart httpd24-httpd ondemand-nginx
 
|-
 
|
 
=== Check File and Directory Permissions ===
 
  
Make sure that the OnDemand files and directories have the correct permissions. You can check the permissions by running the following command:
 
 
ls -l /opt/ood
 
 
Make sure that the files and directories are owned by the correct user and group, and that the permissions are set correctly.
 
||
 
=== Check Configuration Files ===
 
 
Make sure that the OnDemand configuration files are set up correctly. You can check the configuration files by running the following command:
 
 
ls /etc/ood/config
 
 
Make sure that the files exist and are set up correctly. If you need to modify
 
|-
 
|
 
 
=== Check Network Connectivity ===
 
  
Line 62: Line 30:
  
 
Replace <ondemand_server_ip> with the IP address of the OnDemand server.
 
||
 
=== Check SELinux Settings ===
 
 
If you're encountering issues with permissions, check the SELinux settings on the OnDemand server. You can check the SELinux status by running the following command:
 
 
sestatus
 
 
If SELinux is enabled, you may need to modify the settings to allow the OnDemand services to access the necessary files and directories.
 
|-
 
|
 
=== Check the OnDemand Status ===
 
 
You can check the status of the OnDemand services by running the following command:
 
 
systemctl status httpd24-httpd ondemand-nginx
 
 
This will show you whether the services are running or not. If a service is not running, you can start it by running the following command:
 
 
systemctl start <service_name>
 
 
Replace <service_name> with the name of the service that is not running.
 
||
 
=== Check Firewall Rules ===
 
 
Make sure that the firewall on the OnDemand server is not blocking incoming traffic. You can check the firewall rules by running the following command:
 
 
iptables -L
 
 
If you need to open a port, you can do so by running the following command:
 
 
iptables -A INPUT -p tcp --dport <port_number> -j ACCEPT
 
 
Replace <port_number> with the number of the port that you want to open.
 
|}
 

Latest revision as of 14:27, 26 September 2023

Open OnDemand is a web platform for accessing and managing HPC resources. If you're encountering issues while using Open OnDemand, here are some common troubleshooting steps you can take:

Check the Logs

The first thing you should do when encountering an issue with Open OnDemand is to check the logs. The logs can provide useful information about what went wrong and why.

You can find the logs in the ~/ondemand/data/sys/dashboard/batch_connect/sys/APPLICATION directory, where APPLICATION is the name of the tool you are running. Inside that directory are subdirectories whose unique names match the Session IDs in your OOD Session List. Change into the appropriate directory and page through the 'output.log' file. The error is usually near the bottom of the file and tends to be self-explanatory, such as 'out of disk space' or 'Some of your processes may have been killed by the cgroup out-of-memory handler'.
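
As a rough illustration of the steps above, the following shell sketch finds the most recently modified output.log under one APPLICATION directory and prints its last lines. The function names and the 40-line tail are illustrative choices, not part of Open OnDemand, and the sketch assumes the directory layout described above.

```shell
# Sketch only: the helper names below are hypothetical, not OOD commands.

# Print the path of the most recently modified output.log under the
# APPLICATION directory given as $1 (prints nothing if none exists).
newest_output_log() {
    find "$1" -type f -name output.log -exec ls -t {} + 2>/dev/null | head -n 1
}

# Show the end of that log, where the error is most likely to appear.
show_session_error() {
    log=$(newest_output_log "$1")
    if [ -z "$log" ]; then
        echo "no output.log found under $1" >&2
        return 1
    fi
    echo "== $log =="
    tail -n 40 "$log"
}
```

For example, with a hypothetical application directory named jupyter, you could run show_session_error ~/ondemand/data/sys/dashboard/batch_connect/sys/jupyter. To inspect an older session, change into the directory named after its Session ID instead.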

Full Home Directory

If you are unable to access your files or launch jobs, your home directory may be full. Check your disk usage with the command du -sh ~/* in the terminal, or use the 'ncdu' tool from the 'ufrc' environment module. If your home directory is full, remove unnecessary files. Move all data used in jobs to the Blue filesystem to avoid filling up the home directory and violating the RC storage policy.
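
One thing to keep in mind is that the ~/* pattern in du -sh ~/* does not match hidden entries such as dot-directories, which can also take up space. The sketch below is one possible way to list the largest top-level entries, hidden ones included, sorted by size. The function name and its defaults are illustrative assumptions.

```shell
# Sketch only: largest_entries is a hypothetical helper name.
# $1: directory to inspect (defaults to your home directory)
# $2: how many entries to show (defaults to 10)
largest_entries() {
    dir="${1:-$HOME}"
    count="${2:-10}"
    # du -sk reports each entry's total size in KiB so sort -n can order
    # the results; find also returns hidden entries that ~/* would skip.
    find "$dir" -mindepth 1 -maxdepth 1 -exec du -sk {} + 2>/dev/null \
        | sort -rn \
        | head -n "$count"
}
```

Running largest_entries with no arguments lists the ten largest items in your home directory, largest first.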

Proxy Errors

If you are unable to connect to Open OnDemand, you may be experiencing issues with your network proxy settings. Try configuring your proxy settings to allow connections to the ondemand.rc.ufl.edu domain. If you are still unable to connect, please contact us for assistance.
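
Before contacting support, it can help to see how the server responds from your machine. The sketch below uses curl to request a URL and gives a rough interpretation of the HTTP status code. The function names and the wording of the messages are illustrative, and the https:// scheme in the example is an assumption; the text above names only the domain.

```shell
# Sketch only: both function names are hypothetical helpers.

# Give a rough, human-readable reading of an HTTP status code.
describe_status() {
    case "$1" in
        000)     echo "no response (DNS, network, or proxy problem)" ;;
        2??|3??) echo "reachable (HTTP $1)" ;;
        401|403) echo "reachable but access denied (HTTP $1)" ;;
        407)     echo "proxy authentication required (HTTP 407)" ;;
        5??)     echo "server or proxy error (HTTP $1)" ;;
        *)       echo "unexpected response (HTTP $1)" ;;
    esac
}

# Request the URL in $1 and describe the result.
# curl reports the code as 000 when it gets no HTTP response at all.
check_ood() {
    code=$(curl -s -o /dev/null -w '%{http_code}' --max-time 10 "$1") || code=000
    describe_status "$code"
}
```

For example: check_ood https://ondemand.rc.ufl.edu. Including the reported status in your support request may make the problem easier to diagnose.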

Clear Browser Cache

If you're encountering issues with the OnDemand web interface, try clearing your browser cache. This can often resolve issues with outdated or corrupted cached files.

Check Network Connectivity

Ensure that your computer is connected to the network and that you can reach the OnDemand server. You can use the ping command to test network connectivity:

ping <ondemand_server_ip>

Replace <ondemand_server_ip> with the IP address of the OnDemand server.