Resource management
Introduction - Resource management
This section describes what a user must do before Nimrod can use their computational resources.
As Nimrod is a user-based application, there is no option for a global resource type. Each user has to add their own resources. Alternatively, you can add resources for a user when you create that user in Nimrod.
Before an experiment can run, Nimrod needs to know what computational resources it can use.
Nimrod lets each user access their resources through several methods (e.g. Globus and PBS), so each user must enter the details of their own resources.
There are multiple methods that Nimrod supports to access computational resources. They are:
- Globus: The Globus Toolkit is a set of tools that enables users to use remote resources in a predefined way. If a user's resource has Globus installed, Nimrod can use the Globus commands to use that resource. Globus resources can only be added if the user has Globus credentials (see below).
- PBS: If the Nimrod server is a PBS submission machine, Nimrod can use PBS to access the PBS nodes. Nimrod is confined by the constraints of the queuing system, so a user cannot allocate more nodes than PBS allows.
- Fork: Nimrod supports using the Nimrod server itself to launch jobs. This is designed for SMP machines and for testing. This resource option may not appear in all installations.
- Nimrod can support other queuing systems. Feel free to ask.
Adding a resource
In the resource management section of the Portal, the bottom item in the list of resources is a blank item in which the user can enter a new resource.
The options are:
- Resource type: Select the resource type. Note, the Globus type requires a default Globus certificate with a valid proxy certificate.
- Host and/or queue: Type in the resource details. For the fork resource type, this option is ignored. For Globus resources, this is <hostname>/<jobmanager>. For most other queuing systems, it should be in the form <queue>@<queue server>.
- Max. agents: This setting tells Nimrod the maximum number of requests the user can have submitted to this resource at any given time. As Nimrod never requests more than it needs, this option only matters if the resource's administrator asks for the number of jobs to be limited. Set this value high, or to what the administrators suggest. There can never be more than this number of jobs "Executing" on this resource. For Globus resources, setting this value to "0" tells Nimrod to ask the resource's MDS to determine this value.
- Globus Proxy: Select the Globus credentials needed for this resource. This is only needed for the Globus resource type.
- Nimrod proxy?: Select this if you need to use the Nimrod Proxy for this resource. This applies only to the Globus resource type, and only when the compute nodes cannot connect to the Nimrod server.
All of these options can be changed later in the "Configure" option for the resource.
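The field formats above can be illustrated with a short Python sketch. This is not part of Nimrod: `ResourceSpec`, `parse_host_or_queue` and `agents_to_request` are hypothetical names used only to show how the "Host and/or queue" value is read for each resource type and how "Max. agents" caps the number of executing jobs. The special meaning of "0" for Globus resources is not modelled here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ResourceSpec:
    """Hypothetical container for the values entered in the Portal form."""
    resource_type: str            # "globus", "pbs", "fork", ...
    host: Optional[str] = None    # hostname (Globus) or queue server (batch)
    target: Optional[str] = None  # jobmanager (Globus) or queue name (batch)


def parse_host_or_queue(resource_type: str, value: str) -> ResourceSpec:
    """Split the "Host and/or queue" field according to the resource type."""
    kind = resource_type.strip().lower()
    if kind == "fork":
        # The field is ignored for fork resources.
        return ResourceSpec(kind)
    if kind == "globus":
        # Globus form: <hostname>/<jobmanager>
        host, sep, jobmanager = value.strip().partition("/")
        if not (host and sep and jobmanager):
            raise ValueError("expected <hostname>/<jobmanager>")
        return ResourceSpec(kind, host=host, target=jobmanager)
    # Most other queuing systems: <queue>@<queue server>
    queue, sep, server = value.strip().partition("@")
    if not (queue and sep and server):
        raise ValueError("expected <queue>@<queue server>")
    return ResourceSpec(kind, host=server, target=queue)


def agents_to_request(wanted: int, executing: int, max_agents: int) -> int:
    """Cap new requests so executing jobs never exceed max_agents."""
    return max(0, min(wanted, max_agents - executing))
```

For example, `parse_host_or_queue("pbs", "batch@pbs.example.org")` yields the queue `batch` on the server `pbs.example.org` (both made-up names), and `agents_to_request(wanted=10, executing=3, max_agents=5)` allows only 2 further requests.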
In addition to the above options, the configure window may ask for further information that Nimrod was unable to discover by itself:
- Architecture: What is the OS architecture of the remote resource? Nimrod uses this information to determine which "Nimrod Agent" to copy to the resource. If this is unknown, Nimrod cannot use that resource.
- Batch/Fork: Nimrod needs to know whether the resource has a queuing system, that is, whether it is dealing with a remote queuing system or not.
For Globus resources, "Configure" will run Nimrod's Globus check to see if it can contact the resource.
The output first shows whether the resource's MDS is accessible; Nimrod then tries to run a simple program that determines the resource's architecture. If the MDS is down, the Portal will normally ask what submission type the resource is. If the Portal asks what architecture the resource is, it is most likely that the resource will not work. Please check the Globus certificate sections below for reasons why a resource may not work.
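The relationship between the Globus check and the extra configure questions can be summarised in a small Python sketch. It only restates the behaviour described above; `configure_followups` and its arguments are hypothetical and do not correspond to any Nimrod function.

```python
from typing import List


def configure_followups(mds_reachable: bool, architecture_detected: bool) -> List[str]:
    """Return the extra questions the configure window would need to ask."""
    questions = []
    if not mds_reachable:
        # Without the MDS, the submission type (batch or fork) is unknown.
        questions.append("submission type")
    if not architecture_detected:
        # The test program did not report an architecture. Being asked
        # this question usually means the resource will not work.
        questions.append("architecture")
    return questions
```

An empty list corresponds to a resource that Nimrod could configure without further input.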
What is Globus?
The Globus Toolkit is a set of tools that enables users to use remote resources in a predefined way.
To use a Globus resource, a user must have a Globus proxy certificate.
This certificate contains a unique identifier that remote resources use to identify the user.
Certificates need to be signed by a trusted authority (please talk with your administrators about obtaining one). The resource that the user wants to use must trust that signing authority, which means it needs the signing authority's public key in its trust directory.
In addition, the Nimrod server needs to trust the certificate of the user's remote resource server.
Here are a few things that need to happen before a user can use a Globus resource.
- They need to obtain a certificate from a trusted authority.
- This certificate will have a subject that the remote resource needs to map to the user's account name.
- The remote resource will have to add the public certificate of the user's signing authority to its list of trusted CAs.
- The administrator of the Nimrod server will need to add the signing authority of the remote server to the server's list of trusted CAs.
- Finally, the user will need to create a proxy certificate from their certificate (see below).
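The subject-to-account mapping in the list above is normally held on the remote resource in a Globus grid-mapfile, where each line pairs a quoted certificate subject with a local account name. The Python sketch below shows how such a file is read; it is an illustration only, `account_for_subject` is a hypothetical helper, and the subject and account used in the example are made up.

```python
import shlex
from typing import Optional


def account_for_subject(gridmap_text: str, subject: str) -> Optional[str]:
    """Look up the local account mapped to a certificate subject."""
    for line in gridmap_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        # shlex keeps the quoted subject together as a single token.
        parts = shlex.split(line)
        if len(parts) >= 2 and parts[0] == subject:
            return parts[1]
    return None


# Made-up example entry: a quoted subject followed by an account name.
EXAMPLE_GRIDMAP = '"/C=AU/O=Example Grid/CN=Jane Citizen" jcitizen\n'
```

If the lookup returns `None` for the user's subject, the remote resource has no mapping for that certificate, which is one reason a Globus resource may fail to configure.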
This is just a brief explanation of Globus credentials.
Please see the Globus Toolkit documentation for more information.
Managing certificates on the portal
Uploading a certificate
As most of the certificate side of Globus involves administrators, there is not much left for the user to do.
When a user has obtained a certificate, they need to upload it to the portal. This is done under the certificate section in the "Resource management" page.
The user will need to give the certificate a user-friendly name by which the portal will identify it. A certificate has two parts: the certificate file and the certificate's key file. The user can either select these files on their local file system or paste their contents into text boxes.
Creating a proxy certificate
Once a user has a valid Globus certificate, they will need to create a proxy certificate.
Proxy certificates are used to connect with the remote resources.
They differ from an ordinary Globus certificate in that they do not require a password or passphrase to be used.
Because of this, it is recommended that the user create a proxy certificate that is valid only for as long as they require.
Proxy certificates can be recreated during experiments.
To generate a proxy, select the "Generate Proxy" action button in the Certificate section under the "Resource management" page. This will require your certificate passphrase and the amount of time required for the proxy.
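When choosing the amount of time for a proxy, it can help to compare its expiry with how long the experiment is expected to run. The Python sketch below is one way to make that comparison; `proxy_outlasts` and the one-hour default margin are assumptions for illustration, not Nimrod settings.

```python
from datetime import datetime, timedelta


def proxy_outlasts(expires_at: datetime,
                   now: datetime,
                   expected_runtime: timedelta,
                   margin: timedelta = timedelta(hours=1)) -> bool:
    """Return True if the proxy stays valid for the run plus a safety margin."""
    return expires_at - now >= expected_runtime + margin
```

A `False` result suggests either requesting a longer lifetime or planning to regenerate the proxy while the experiment is running.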
Uploading a proxy certificate
Nimrod does not use Globus certificates directly; it uses Globus proxy certificates. If the user has generated a proxy certificate elsewhere, they can upload it in the certificate section of the "Resource management" page.
The user must give the uploaded proxy a user-friendly name.