Installation requirements | A description of what is required to run the portal. |
Installation location | Information on where to install the portal. |
Installation from source | Instructions on how to compile binaries from the source. |
Install binaries | Instructions on how to run the portal. |
Configure WWW install | Steps required in configuring the portal. |
Configure remote WWW install | Steps required to configure a remote web server, separate from the portal/Nimrod server. |
Set up the Nimrod Portal database | The Nimrod Portal database |
MOTD and the disclaimer | Information on the MOTD and the Disclaimer. |
Configuring the Nimrod bank | How to configure Nimrod to use the Nimrod bank and configuring resources to report their cost. |
Configuring Nimrod to use user certificate authentication | Enable users to validate with certificates instead of using their username and password. |
Diagnosing problems | What to check and how to turn on debugging modes. |
Common problems | Some common problems with the installation. |
Installation location
The Portal does not have an installation program so we suggest you unpack the distribution directly into the install directory.
As the users will never need to explore this directory, we suggest you chmod 711 nimrodportal to discourage curious users.
Installation requirements
The Nimrod Portal has been tested under the Apache web server on Linux, SunOS and NetBSD without any changes to the source. As the web server is only used to launch the portal's CGI script, there should not be many problems on other systems.
These instructions assume that the web server is installed on the Nimrod server. This is the easiest way to install the Nimrod Portal; however, it is possible to have the web server separate from the Nimrod install. We suggest you first follow these instructions and install on the Nimrod server, then examine the two example files, np.ssh.cgi.example and np.sshd.example, in the sbin directory. You will need an ssh key pair without a passphrase to connect to a user without rights on the Nimrod server.
NOTE: We have observed that ssh can place a lot of load on the web server. Please ensure that you have a device called /dev/*random and that ssh is enabled to use it.
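A quick way to look for such a device is sketched below. The helper name first_char_device is ours and is not part of the portal; the sketch only confirms that a character device exists at one of the given paths, and does not check whether ssh was built to use it.

```sh
#!/bin/sh
# Print the first argument that is a character device.
# Returns non-zero if none of the arguments is one.
# first_char_device is a hypothetical helper name.
first_char_device() {
    for candidate in "$@"; do
        if [ -c "$candidate" ]; then
            echo "$candidate"
            return 0
        fi
    done
    return 1
}

# Example: look for a kernel random device on the web server.
#   first_char_device /dev/urandom /dev/random || echo "no random device found"
```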
As Nimrod's database backend is currently PostgreSQL, the Portal has been developed with PostgreSQL. It may work with other SQL implementations, but we have not found the need to try, as PostgreSQL is already installed with Nimrod.
The portal assumes that PostgreSQL has been configured with a default database for every user, so when the Nimrod Portal calls upon Nimrod, Nimrod connects to the user's default database. This is normally not a problem, as PostgreSQL normally uses the user's name as the default database name, but we have seen some installations that have no default.
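To make the notion of a default database concrete, the sketch below mimics the rule the standard PostgreSQL client library follows when no database is named: PGDATABASE if it is set, otherwise the connecting user name. The function name default_pg_database is ours. Installations that wrap psql (such as the pg_wrapper case listed under Common problems) may not follow this rule, so treat the result as a hint only.

```sh
#!/bin/sh
# Print the database name a stock psql would pick when none is given:
# PGDATABASE if set, otherwise the connecting user name
# (PGUSER if set, otherwise the operating-system user).
# default_pg_database is a hypothetical helper name.
default_pg_database() {
    if [ -n "${PGDATABASE:-}" ]; then
        echo "$PGDATABASE"
    elif [ -n "${PGUSER:-}" ]; then
        echo "$PGUSER"
    else
        id -un
    fi
}

# Example: run as the portal user to see which database Nimrod would reach.
#   default_pg_database
```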
Although the portal requires Nimrod, it does not need Nimrod to install. If the portal has been correctly installed, it will constantly report errors to the user until Nimrod is installed on the server.
Installation from source
Instructions:
Install binaries
The directory where you have installed the portal files does not need to be added to the system path. However, if you have any of the following external packages that integrate with the portal, you will need to copy them to the ${NIMRODPORTAL_INSTALL}/bin directory.
These packages are:
| If you have Nimrod/O version 2.5 or greater, copy the nimrodo and the Nimcache (and gamut if setting up Nimrod/OI) executables to the nimrodportal's bin directory. Instructions on configuring Nimrod/O for the portal are located in the Set up the Nimrod Portal database section. |
| The Nimrod Portal can use the EnFuzion enfgenerator as an alternative to Nimrod's own generator. This version gives better feedback when there are errors in the user's .pln files. Copy enfgenerator to the nimrodportal's bin directory. No further configuration is required. |
Now that you have the portal configured, you'll need to set up the cgi script and Java/Image directories.
Configure Web install
It is strongly advised that the portal is installed on an https server. When the user logs in, a clear text password is sent to the portal server. Using https, this is encrypted. |
The option descriptions are:
NIMROD_INSTALL | This must point to the location where Nimrod is installed. |
PSQL_INSTALL | This must point to the location where PostgreSQL is installed. It must be the parent of its bin directory. E.g. if the psql command is in /usr/bin/psql, then this option must be /usr |
NIMRODPORTAL_INSTALL | This must point to the location where the Nimrod Portal has been installed. It must be the parent of the bin directory. |
NIMRODPORTAL_URL | This must be the URL to the parent of the npapplet and the images directories. |
NIMRODPORTAL_BIN_URL | This must be the URL to this CGI script. The Nimrod Portal uses this to generate the URLs. |
GLOBUS_LOCATION | This must be set to the path of the Globus install. Both Nimrod and the Portal use this path to use Globus resources. |
PBS_LOCATION | This must be set to the path of the PBS bin install. Nimrod uses this path to work with PBS. |
CONDOR_CONFIG | This must be set to the full path of the Condor config file. Nimrod uses this file to work with Condor. |
SGE_ARCH, SGE_ROOT | Nimrod uses these variables to work with SGE. |
There are other options that may need to be changed in non-standard installations.
    #!/usr/bin/env bash
    #####################################################################
    #
    # This is the example file for the main cgi script used for the
    # Nimrod Portal. It needs to be customised to your resource.
    #
    #####################################################################

    # Options used by Nimrod

    # Database type
    NIMROD_DATABASE=pgsql

    # Nimrod install
    NIMROD_INSTALL=/opt/nimrod-3.2.0

    # The PSQL installation directory
    PSQL_INSTALL=/usr
    PSQL_LOCATION=${PSQL_INSTALL}/bin

    # The bin directory of qsub for PBS
    PBS_LOCATION=/usr/pbs/bin

    # The Condor config file
    CONDOR_CONFIG=/usr/condor/etc/condor_config

    # The type of sun grid engine installed
    SGE_ARCH=glinux

    # The root directory SGE
    SGE_ROOT=/opt/gridengine

    # Options used by the Portal

    # This install directory of the portal.
    NIMRODPORTAL_INSTALL=/opt/nimrodportal

    # This is the common URL for the images and applets. The portal will
    # append /images/ or /nvapplet/ to this URL.
    NIMRODPORTAL_URL=https://nimrod.csse.monash.edu.au

    # This is the URL for this script.
    NIMRODPORTAL_BIN_URL=https://nimrod.csse.monash.edu.au/cgi-bin/np.cgi

    # This is the command the Portal will execute to establish a connection
    # to the SQL database. This is done as root. The results back must
    # be comma separated. The portal also expects that newline will
    # terminate each request. The following example should be on one line.
    # And the comma must appear exactly after the -F
    NIMRODPORTAL_SQL_CMD="${PSQL_LOCATION}/psql -t -A -F, -U nimrodportal -d nimrodportal -S"

    # Globus install path
    GLOBUS_LOCATION=/opt/globus/2.4

    ####################################################################
    #
    # Options below may only need to be modified if you are having
    # problems with library paths etc.
    #
    ####################################################################

    export NIMROD_DATABASE
    export NIMROD_INSTALL
    export NIMRODPORTAL_INSTALL
    export NIMRODPORTAL_URL
    export NIMRODPORTAL_BIN_URL
    export NIMRODPORTAL_SQL_CMD
    export PSQL_INSTALL
    export PSQL_LOCATION
    export GLOBUS_LOCATION
    export PBS_LOCATION
    export CONDOR_CONFIG
    export SGE_ARCH
    export SGE_ROOT

    PATH=$NIMROD_INSTALL/bin:$PATH:/bin:/sbin:/usr/bin:/usr/sbin
    PATH=$PATH:/usr/X11R6/bin:/usr/local/bin:/usr/local/sbin
    PATH=$PATH:$GLOBUS_LOCATION/bin:$PBS_LOCATION
    export PATH
    PATH=$PSQL_INSTALL/bin:$NIMRODPORTAL_INSTALL/bin:$PATH ; export PATH
    LD_LIBRARY_PATH=$GLOBUS_LOCATION/lib:$LD_LIBRARY_PATH
    export LD_LIBRARY_PATH
    LIBPATH=$LIBPATH:$LD_LIBRARY_PATH ; export LIBPATH

    # Check that directory structure is accessible
    if [ ! -x $NIMRODPORTAL_INSTALL/bin/nimrodportal.cgi ]
    then
        echo Content-type: text/plain
        echo
        echo -n User \'`whoami`\' with group access\(es\) \'`groups`\'
        echo is unable to execute \'$NIMRODPORTAL_INSTALL/bin/nimrodportal.cgi\'
        exit
    fi

    # Launch the portal.
    exec $NIMRODPORTAL_INSTALL/bin/nimrodportal.cgi 2>/dev/null
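Both PSQL_INSTALL and NIMRODPORTAL_INSTALL are expected to name the directory that contains a bin directory, not the bin directory itself. As a small illustration, the sketch below derives that directory from the full path of a binary. The function name install_root_for is ours and is not part of the portal.

```sh
#!/bin/sh
# Given the full path of a binary, print the directory that contains
# its bin directory, e.g. /usr/bin/psql -> /usr.
# install_root_for is a hypothetical helper name.
install_root_for() {
    dirname "$(dirname "$1")"
}

# Example: derive PSQL_INSTALL from whichever psql is first on the PATH.
#   PSQL_INSTALL=$(install_root_for "$(command -v psql)")
```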
Configure remote Web install
The location of installed programs looks like this:
Computer A | Computer B |
---|---|
|
|
Computer A | Computer B |
---|---|
All files on Computer A are located in the cgi-bin directory and must be owned by the web-server user and readable by only the owner (chmod og-rwx known_hosts nimrodportal_id):
|
|
Obtaining these files can be done this way:
To test this, on Computer A change into the directory where the np.ssh.cgi script is and type this command:
ssh -o BatchMode=yes -o UserKnownHostsFile=known_hosts -x -i nimrodportal_id <nimrod portal username>@<nimrod portal host> echo test
The word "test" should be returned. If not, please make sure you have the correct known_hosts and nimrodportal_id files.
You will also need to set some environment settings in np.ssh.cgi to tell the script where the Nimrod Portal is installed. The two settings are SSH_DESTINATION (in the form username@nimrod.server) and NIMRODPORTAL_INSTALL, which is the location where the Nimrod Portal is installed on the Nimrod server (Computer B).
If successful, all connections to the cgi script on the web server (Computer A) will then redirect down SSH to Computer B's Nimrod Portal install.
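One way the nimrodportal_id key pair and the known_hosts file used in this section might be produced is sketched below. It assumes OpenSSH's ssh-keygen and ssh-keyscan are available on Computer A; the helper names are ours and are not part of the portal distribution. A host key collected with ssh-keyscan should be checked against Computer B by other means before it is trusted.

```sh
#!/bin/sh
# Sketch only: the helper names below are hypothetical.

# Create a key pair with an empty passphrase in the current directory.
# The public half (nimrodportal_id.pub) would then be added to the
# ~/.ssh/authorized_keys file of the portal account on Computer B.
make_portal_keypair() {
    ssh-keygen -q -t rsa -N "" -f nimrodportal_id
}

# Record the host key of Computer B so that BatchMode connections
# do not stop to prompt. $1 is the host name of the Nimrod server.
make_known_hosts() {
    ssh-keyscan "$1" > known_hosts
}

# Remove all group and other permissions from the given files.
restrict_to_owner() {
    chmod og-rwx "$@"
}

# Example, run as the web-server user in the cgi-bin directory:
#   make_portal_keypair
#   make_known_hosts nimrod.server
#   restrict_to_owner known_hosts nimrodportal_id
```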
Set up the Nimrod Portal database
createuser -A -D nimrodportal
Creating the database is described below. The nimrodportal user and database must be protected from other users. Here are a few tips:
This line must appear in the pg_hba.conf file before any other match. That is, it must be before the "local any any" line.
pg_hba.conf |
local nimrodportal nimrodportal ident nimrod |
As the Nimrod Portal is launched as the apache user and the superuser ID is then set, you may need to map both the apache and the root users to the nimrodportal database user.
pg_ident.conf |
    nimrod root nimrodportal
    nimrod apache nimrodportal
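The ordering rule for pg_hba.conf can be checked mechanically. The sketch below is a rough check only: it assumes the whitespace-separated layout shown above, with the record type in the first field and the database name in the second. The function name first_local_rule_is_portal is ours.

```sh
#!/bin/sh
# Succeed if the first "local" record in the given pg_hba.conf file
# is the one for the nimrodportal database.
# first_local_rule_is_portal is a hypothetical helper name.
first_local_rule_is_portal() {
    awk '
        # Skip comment lines and blank lines.
        /^[ \t]*#/ || NF == 0 { next }
        # Stop at the first local record and remember what it matched.
        $1 == "local" { found = 1; ok = ($2 == "nimrodportal"); exit }
        END { if (found && ok) exit 0; exit 1 }
    ' "$1"
}

# Example (the path to pg_hba.conf depends on the installation):
#   first_local_rule_is_portal /path/to/pg_hba.conf || echo "move the nimrodportal line up"
```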
All of these scripts read the sbinenv.conf file to find out the location of the Postgres commands. You may need to edit this file to include the full paths of the commands.
We suggest you create a script called npinit and have something like this in it:
    #!/bin/sh
    # Init the database
    ./npcreatedb
    # Add the administrators
    ./addadmin cme
    ./addadmin slavisa
    # Add plugins
    ./nimrod_plugin
    ./nimrodO_plugin
    ./nimrodOI_plugin
    ./Manual_plugin
    ./GridWorld_plugin
    ./globus_plugin
    ./fork_plugin
    ./pbs_plugin
Congratulations: You are now ready to log into the Nimrod Portal!
MOTD and the Disclaimer message
The command resetmotd, located in the sbin directory, resets all users so that they see the MOTD again. Run it after the file is updated so that currently logged-in users will see the new message.
A user cannot use the Nimrod Portal until they accept the disclaimer.
The command resetdisclaimer located in the sbin directory will reset all users to be prompted to acknowledge again.
Configuring the Nimrod bank services
This will cause the Nimrod Portal to try to contact the bank to determine whether the user has enough credit. To choose which bank to use, configure the command the Portal uses to contact the bank. This is done in the script called accessbank in $NIMRODPORTAL_INSTALL/bin. The installed example only needs the hostname and the path to the bank's script updated. (Note: you may remove the globus command and host and use your local server as the bank server; however, users must still have a default proxy in order to be identified by the bank.)
You will also need to add yourself as a bank manager. To do this, add your Globus certificate subject to the file $NIMRODPORTAL_INSTALL/bankdata/manager. Your subject must be quoted. Once you have done this, you can modify other users' balances under your balance section in the resource management section of the Portal.
The Nimrod Portal will run the command located in the file /etc/nimrodresourcecost.conf on the remote resource. If this file or the command that it refers to is not there, then no cost information will be reported.
Example of the /etc/nimrodresourcecost.conf file:
/usr/bin/nimrodresourcecost |
The output from the program /usr/bin/nimrodresourcecost must be in the expected format. For a resource that has no quota, an example script will look like:
    #!/bin/sh
    echo CostPerHour: 20
    echo UserQuotaHoursLeft: -2
    echo QuotaExpDate: Never
The only non-obvious setting is UserQuotaHoursLeft being -2, which represents unlimited quota. It is not used directly by Nimrod, but is displayed to the user so that the user can decide how to use the resource.
A more complicated example may look like this:
    #!/usr/bin/env bash
    echo CostPerHour: 10
    /opt/rash/bin/quotasu -P g14 | /bin/awk '{if ($1 == "Service" && $2 == "Units" && $3 == "remaining") print "UserQuotaHoursLeft: " $4}'
    /opt/rash/bin/nf_limits | /bin/awk 'BEGIN{doit=0}{ if (doit == 1 && $2 != "") print "QueueRatio-" $1 ": " $2}{if (doit == 1 && $1 == "---------------------------------------------------------------------------") doit = 2}{if (doit == 0 && $1 == "---------------------------------------------------------------------------") doit = 1}'
    MONTH=`/bin/date '+%m'`
    if [ $MONTH -le 3 ]
    then
        echo QuotaExpDate: 31-Mar-`date '+%Y'`
    else
        if [ $MONTH -le 6 ]
        then
            echo QuotaExpDate: 30-Jun-`date '+%Y'`
        else
            if [ $MONTH -le 9 ]
            then
                echo QuotaExpDate: 30-Sep-`date '+%Y'`
            else
                echo QuotaExpDate: 31-Dec-`date '+%Y'`
            fi
        fi
    fi
This example is taken from the National Facility's lc cluster at APAC. The first section of the script outputs the cost of using the resource. The next line runs a command to determine how many CPU hours the user has left in their quota. The third command outputs the cost ratio for each of the queues (e.g. the express queue costs three times as much as the default queue). The last section determines when the user's quota will expire, which in the case of the lc cluster is every three months. This script will output this:
    CostPerHour: 10
    UserQuotaHoursLeft: 50.00
    QueueRatio-express: 3
    QueueRatio-normal: 1
    QueueRatio-bonus: 0
    QueueRatio-interactive: 2
    QueueRatio-copyq: 1
    QuotaExpDate: 31-Dec-2004
The Nimrod Portal expects these values in this format. To check if the script is working, you can always select the "Collect information" button on the "Resource Management" page.
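Before relying on the Portal's "Collect information" button, the output of a cost script can be sanity checked at the prompt. The sketch below assumes that the three keys appearing in both examples (CostPerHour, UserQuotaHoursLeft and QuotaExpDate) are the ones the Portal needs; that is our reading of the examples, not a documented rule. The function name check_cost_output is ours.

```sh
#!/bin/sh
# Read "Key: value" lines on standard input and report any of the
# assumed-required keys that are missing. Exits non-zero if any is absent.
# check_cost_output is a hypothetical helper name.
check_cost_output() {
    awk -F': ' '
        { seen[$1] = 1 }
        END {
            n = split("CostPerHour UserQuotaHoursLeft QuotaExpDate", keys, " ")
            missing = 0
            for (i = 1; i <= n; i++) {
                if (!(keys[i] in seen)) {
                    print "missing " keys[i]
                    missing++
                }
            }
            if (missing > 0) exit 1
        }
    '
}

# Example, on the remote resource:
#   /usr/bin/nimrodresourcecost | check_cost_output
```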
Configuring Nimrod to use user certificate authentication
Diagnosing problems
Portal Page | Nimrod command(s) |
---|---|
Experiments | nimrod portalapi getexperiments |
An experiment | nimrod portalapi getexpinfo <expname> nimrod portalapi getdeadline <expname> |
Creating an experiment (also used in reset) | nimrod portalapi addrun <expname> <exptype> |
Deleting an experiment (also used in reset) | nimrod portalapi delrun <expname> |
Saving a plan file | nimrod generate <plan file> |
Starting an experiment | nimrod portalapi startexp <expname> |
Stopping/Pausing an experiment | nimrod portalapi stopexp <expname> |
Setting a deadline in an experiment | See the Nimrod manual about: nimrod portalapi setdeadline |
Allocating a resource to an experiment | nimrod portalapi addserver <expname> <host> <resource type> nimrod portalapi removeserver <expname> <host> <resource type> |
Adding/modifying a resource | nimrod resource add <resource type> <host> nimrod portalapi setarch <resource type> <host> <arch> nimrod portalapi setgproxy <resource type> <host> <certificate> |
Removing a resource | nimrod resource remove <resource type> <host> |
Checking a resource | nimrod resource check <resource type> <host> |
The Nimrod Job viewer applet | nimrod enfapi <Experiment name> <EnFuzion API call> |
The Portal also maintains Globus credentials. Please ensure that these commands work at the user's prompt.
Portal Page | Globus command(s) |
---|---|
Resource Management Page | $GLOBUS_LOCATION/etc/globus-user-env.sh grid-cert-info -subject -file <Certificate file name> grid-cert-info -enddate -file <Certificate file name> grid-proxy-info -subject -file <Proxy certificate file name> grid-proxy-info -timeleft -file <Proxy certificate file name> |
Create Proxy | $GLOBUS_LOCATION/etc/globus-user-env.sh grid-proxy-init -cert <Certificate file name> -key <Certificate key's file name> -out <Destination proxy file name> -pwstdin -bits 1024 -hours <hours> |
This log file has been set up for debugging purposes and does not have a friendly interface. Please follow these instructions carefully, as the portal will crash without giving any reason if this is done wrong.
Follow these instructions after installing the portal. The log file must be created and must be world readable and writeable, so we need to create the log file first. Do:
    cd log
    touch runtime.log
    chmod 666 runtime.log
    chmod a+x .
Now we have to tell the portal to use this file. The Nimrod Portal binary outputs debug information to stderr. As this is normally redirected to /dev/null in the np.cgi script, we need to redirect it to the new file. Change the line that reads:
    $NIMRODPORTAL_INSTALL/bin/nimrodportal.cgi 2>/dev/null
to redirect it to the full path of the log file. E.g.:
    $NIMRODPORTAL_INSTALL/bin/nimrodportal.cgi 2>> $NIMRODPORTAL_INSTALL/log/runtime.log
Now that is done, we need to get the information we need from it. The best way to do this is to bring the portal to the point just before the error and then clear the log with the command:
    echo -n "" > runtime.log
Reproduce the error and copy (not move) the log file to a permanent place.
To turn off logging, change the line we modified in the np.cgi script back to what it was. We recommend this as the log file gets large very quickly.
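The log preparation and clearing steps can be wrapped in two small helpers, sketched below. The function names are ours, and the sketch assumes the log directory lives directly under the portal install directory, as in the redirect example above. Editing np.cgi itself is still done by hand.

```sh
#!/bin/sh
# Sketch only: both helper names are hypothetical.

# Create a world-readable, world-writable runtime.log under <install>/log.
# $1 is the portal install directory.
prepare_runtime_log() {
    log_dir=$1/log
    mkdir -p "$log_dir" || return 1
    touch "$log_dir/runtime.log" || return 1
    chmod 666 "$log_dir/runtime.log" || return 1
    chmod a+x "$log_dir"
}

# Empty the log just before reproducing an error.
clear_runtime_log() {
    : > "$1/log/runtime.log"
}

# Example (install path is an assumption):
#   prepare_runtime_log /opt/nimrodportal
#   clear_runtime_log /opt/nimrodportal
```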
Common problems
Problem: | The portal complains about Nimrod errors and in the output of the Nimrod error there is a message like "exited with code 127: error while loading shared libraries: libglobus_gsi_proxy_core_gcc32.so.0: cannot open shared object file: No such file or directory". |
Solution: | Add the full path of the globus lib directory to /etc/ld.so.conf (E.g. /opt/gtk-2.4/lib) and then run ldconfig as root. |
Problem: | The portal complains about Nimrod errors and in the output of the Nimrod error there is a message like "psql (pg_wrapper): No database specified". |
Solution: | Nimrod assumes that psql knows which database to use for a given user as this is the default on some Linux distributions. If this is not the case, please add "export PGDATABASE=$USER" to $NIMRODPORTAL_INSTALL/bin/usersetenv. This will tell psql to use the database with the same username as the user. |
Problem: | The portal complains about not having execute access to the Nimrod Portal binary. |
Solution: |
There are two main causes of this. First, check that the $NIMRODPORTAL_INSTALL/bin directory (and all of its parent directories) is group-owned by a group that the apache user belongs to.
Second, you might have SELinux installed. SELinux will stop the web server from accessing any files it does not have SELinux permission for. Adding the Nimrod Portal install to SELinux will not fix the problem, as the process will eventually need access to user accounts, the Globus install and any other system files used. If you do not need the security provided by SELinux, you can disable it; otherwise, you can follow the instructions above under Configure remote Web install. |
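For the first cause, it helps to see the owner, group and mode of the bin directory and of every directory above it in one listing. A minimal sketch follows; the function name list_path_permissions is ours, and the install path in the example is an assumption.

```sh
#!/bin/sh
# Print an "ls -ld" line for the given path and for each of its parent
# directories, ending at the root directory.
# list_path_permissions is a hypothetical helper name.
list_path_permissions() (
    path=$1
    while :; do
        ls -ld "$path"
        case $path in
            /|.) break ;;
        esac
        path=$(dirname "$path")
    done
)

# Example:
#   list_path_permissions /opt/nimrodportal/bin
# Then compare the group column with the output of: groups apache
```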