Introduction
Generally speaking, an advance reservation blocks the requested resources from use by other jobs over a defined time interval. Most of the commonly used Distributed Resource Management Systems (DRMSs) have recently added advance reservation functionality. We see a need for one consistent Advance Reservation API (AR API) that could be implemented for many different DRMSs, giving the user a degree of independence from any particular system.
The main AR API routines are described below. The library design is based on DRMAA (Distributed Resource Management Application API) and the LSF API, which explains the similar naming conventions and routines.
Routines
Library initialization
int ar_init(const char *contact,
            char *error_diagnosis, size_t error_diag_len);
Initializes the library. This call is required before any AR API call that needs to communicate with the DRM. It can be called only once before the corresponding ar_exit call.
The optional, implementation-dependent contact string may specify how to connect to the underlying DRM.
Library deinitialization
int ar_exit(char *error_diagnosis, size_t error_diag_len);
Cleans up and closes the library.
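The pairing of ar_init and ar_exit can be illustrated with a small wrapper. This is only a sketch under stated assumptions: the ar_init and ar_exit bodies below are stand-in stubs so that the example compiles without a real AR library, the error code returned for a repeated ar_init is a guess (the text does not specify it), and with_ar_session and count_call are hypothetical helpers that are not part of the AR API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

#define AR_ERRNO_SUCCESS 0
#define AR_ERRNO_INTERNAL_ERROR 51

/* Stand-in stubs (assumptions, not a real implementation): they only track
 * whether the library is active so the sketch compiles and runs on its own. */
static int stub_active = 0;

static int stub_fail(char *diag, size_t diag_len, const char *msg)
{
    if (diag != NULL && diag_len > 0)
        snprintf(diag, diag_len, "%s", msg);
    /* The code returned for a misplaced call is not specified; this is a guess. */
    return AR_ERRNO_INTERNAL_ERROR;
}

int ar_init(const char *contact, char *error_diagnosis, size_t error_diag_len)
{
    (void)contact; /* the stub ignores the contact string */
    if (stub_active)
        return stub_fail(error_diagnosis, error_diag_len, "already initialized");
    stub_active = 1;
    return AR_ERRNO_SUCCESS;
}

int ar_exit(char *error_diagnosis, size_t error_diag_len)
{
    if (!stub_active)
        return stub_fail(error_diagnosis, error_diag_len, "not initialized");
    stub_active = 0;
    return AR_ERRNO_SUCCESS;
}

/* Hypothetical helper: run work(arg) between a matched ar_init/ar_exit pair. */
int with_ar_session(const char *contact, int (*work)(void *), void *arg,
                    char *diag, size_t diag_len)
{
    assert(work != NULL);

    int rc = ar_init(contact, diag, diag_len);
    if (rc != AR_ERRNO_SUCCESS)
        return rc;

    int work_rc = work(arg);

    /* Always close the session, even when the work itself failed. */
    rc = ar_exit(diag, diag_len);
    return work_rc != AR_ERRNO_SUCCESS ? work_rc : rc;
}

/* Example work item: counts how many times it was invoked. */
int count_call(void *arg)
{
    ++*(int *)arg;
    return AR_ERRNO_SUCCESS;
}
```

Routing all AR calls through one such wrapper keeps every ar_init matched by exactly one ar_exit, which is the constraint stated above.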
Add reservation
int ar_add_reservation(
    char *ar_id, size_t ar_id_len,
    const ar_reservation_template_t *art,
    char *error_diagnosis, size_t error_diag_len
);
Adds a reservation with the specification defined in art. The ar_id buffer receives the advance reservation id returned by the DRMS.
Currently, the following requirements may be defined in ar_reservation_template_t:
/* non-vector reservation attributes */
#define AR_RESERVATION_TYPE "ar_reservation_type"
#define AR_RESERVATION_NAME "ar_reservation_name"
#define AR_RESERVATION_START_TIME "ar_reservation_start_time"
#define AR_RESERVATION_DURATION "ar_reservation_duration"
#define AR_ARCHITECTURE "ar_architecture"
#define AR_SOFTWARE "ar_software"
#define AR_NICENESS "ar_niceness"
#define AR_N_NODES "ar_n_nodes"
#define AR_N_CPUS "ar_n_cpus"
#define AR_CPU_TIME "ar_cpu_time"
#define AR_MEMORY "ar_memory"
#define AR_LOCKED_MEMORY "ar_locked_memory"
#define AR_ADDRESS_SPACE "ar_address_space"
#define AR_FILESYSTEM_SPACE "ar_filesystem_space"
#define AR_HOME_SPACE "ar_home_space"
#define AR_TMP_SPACE "ar_tmp_space"
#define AR_SWAP_SPACE "ar_swap_space"

/* vector attributes */
#define AR_V_NODES "ar_v_nodes"
Remove reservation
int ar_remove_reservation(const char *ar_id,
                          char *error_diagnosis, size_t error_diag_len);
Deletes the reservation with the given ar_id.
Check reservation
int ar_check_reservation(
    const char *ar_id,
    ar_information_template_t **ari,
    char *error_diagnosis, size_t error_diag_len
);
Checks the reservation with the given ar_id and retrieves information about it from the DRMS. This routine is still in the conceptual phase, since different DRMSs describe advance reservations in different terms. For example, Sun Grid Engine uses the idea of "slots", while LSF simply uses "processors", and the two terms are not fully equivalent. Work is currently under way to find and design a consistent set of advance reservation properties (metrics) that could be implemented in every (or at least most) DRMS and would be useful to the user.
Job submission
Job submission is not covered by this API. One should use an API designed for job submission and control, in particular DRMAA. At submission time, a DRMS-dependent attribute can be passed in the job template (in DRMAA this attribute is called drmaa_native_specification). It can be used to pass the id of the advance reservation into which the job should be submitted.
The FedStage implementations of DRMAA for PBS Pro and LSF support the --arid option within the native specification string. This option takes as its argument the identifier of the reservation into which the job should be submitted.
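Building the native specification string can be sketched as follows. The helper name build_arid_native_spec is hypothetical, and the single space between --arid and the identifier is an assumption, since the exact separator is not stated here. With the DRMAA 1.0 C binding, the resulting string would then be passed to drmaa_set_attribute under the DRMAA_NATIVE_SPECIFICATION attribute name; that call is left out so the sketch compiles without a DRMAA library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: writes "--arid <ar_id>" into buf.
 * Returns 0 on success, -1 if ar_id is empty or buf is too small.
 * The space separator is an assumption about the option syntax. */
int build_arid_native_spec(char *buf, size_t buf_len, const char *ar_id)
{
    assert(buf != NULL && ar_id != NULL); /* programmer errors */

    if (buf_len == 0 || strlen(ar_id) == 0)
        return -1;

    int n = snprintf(buf, buf_len, "--arid %s", ar_id);
    if (n < 0 || (size_t)n >= buf_len)
        return -1; /* encoding error or truncated output */

    return 0;
}
```

Reporting truncation as an error, instead of silently submitting with a shortened identifier, avoids placing a job into the wrong reservation or into none at all.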
Information routines
int ar_get_AR_implementation(char *ar_impl, size_t ar_impl_len,
                             char *error_diagnosis, size_t error_diag_len);
Gets information about this AR API implementation.
int ar_version(unsigned int *major, unsigned int *minor,
               char *error_diagnosis, size_t error_diag_len);
Gets information about this AR API version.
int ar_get_DRM_system(char *drm_system, size_t drm_system_len,
                      char *error_diagnosis, size_t error_diag_len);
Gets information about the underlying DRM system.
Errors
const char *ar_strerror(int AR_errno);
Gets the error string for a given error code.
The following error codes are currently defined:
#define AR_ERRNO_SUCCESS 0
#define AR_ERRNO_INTERNAL_ERROR 51
#define AR_ERRNO_DRM_COMMUNICATION_FAILURE 52
#define AR_ERRNO_AUTH_FAILURE 53
#define AR_ERRNO_INVALID_ARGUMENT 54
#define AR_ERRNO_NO_MEMORY 55
#define AR_ERRNO_INVALID_ATTRIBUTE_FORMAT 56
#define AR_ERRNO_INVALID_ATTRIBUTE_VALUE 57
#define AR_ERRNO_CONFLICTING_ATTRIBUTE_VALUES 58
#define AR_ERRNO_TRY_LATER 59
#define AR_ERRNO_DENIED_BY_DRM 60
#define AR_ERRNO_INVALID_AR 61
#define AR_ERRNO_NO_MORE_ELEMENTS 62
#define AR_ERRNO_NOT_IMPLEMENTED 63
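One possible way for a caller to act on these codes is to retry operations that fail with a transient error. The sketch below is an illustration only: treating AR_ERRNO_TRY_LATER and AR_ERRNO_DRM_COMMUNICATION_FAILURE as transient is a policy assumption made for this example and is not defined by the API, and ar_error_is_transient, ar_retry and demo_flaky_op are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

#define AR_ERRNO_SUCCESS 0
#define AR_ERRNO_DRM_COMMUNICATION_FAILURE 52
#define AR_ERRNO_INVALID_ARGUMENT 54
#define AR_ERRNO_TRY_LATER 59
#define AR_ERRNO_DENIED_BY_DRM 60

/* Assumed policy: only these two codes are worth retrying. */
int ar_error_is_transient(int code)
{
    return code == AR_ERRNO_TRY_LATER ||
           code == AR_ERRNO_DRM_COMMUNICATION_FAILURE;
}

/* Hypothetical helper: call op(arg) up to max_attempts times, stopping at
 * the first result that is not transient. A real caller would normally
 * wait between attempts; the delay is omitted to keep the sketch short. */
int ar_retry(int (*op)(void *), void *arg, int max_attempts)
{
    assert(op != NULL);

    if (max_attempts <= 0)
        return AR_ERRNO_INVALID_ARGUMENT;

    int rc = AR_ERRNO_TRY_LATER;
    for (int attempt = 0; attempt < max_attempts; attempt++) {
        rc = op(arg);
        if (!ar_error_is_transient(rc))
            break;
    }
    return rc;
}

/* Demo operation: fails with TRY_LATER until the counter reaches zero. */
int demo_flaky_op(void *arg)
{
    int *remaining_failures = (int *)arg;
    if (*remaining_failures > 0) {
        --*remaining_failures;
        return AR_ERRNO_TRY_LATER;
    }
    return AR_ERRNO_SUCCESS;
}
```

In a real program, op would wrap a call such as ar_add_reservation, and the final return code could be passed to ar_strerror for reporting.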
Auxiliary routines
There is also a set of routines for handling templates such as ar_reservation_template_t and ar_information_template_t. They mainly allow setting and getting values in those opaque structures and are analogous to the corresponding routines in DRMAA. Their function is purely to ease the user's implementation effort, and they do not influence the main functionality of the AR API, so they are not described here.
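To give a rough idea of what such setters and getters do, the following toy sketch stores attribute name/value pairs as strings, in the style of DRMAA attributes. Everything prefixed with toy_ or TOY_ is hypothetical and is not the AR API: the real template types are opaque, their routines are not specified in this document, and the use of string values such as "4" for AR_N_CPUS is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define AR_ERRNO_SUCCESS 0
#define AR_ERRNO_INVALID_ARGUMENT 54
#define AR_ERRNO_NO_MEMORY 55

#define AR_RESERVATION_NAME "ar_reservation_name"
#define AR_N_CPUS "ar_n_cpus"

enum { TOY_MAX_ATTRS = 16, TOY_MAX_LEN = 64 };

/* Toy stand-in for an opaque template: a fixed-size name/value table. */
typedef struct {
    char name[TOY_MAX_ATTRS][TOY_MAX_LEN];
    char value[TOY_MAX_ATTRS][TOY_MAX_LEN];
    size_t count;
} toy_template_t;

/* Sets (or overwrites) one attribute; values are plain strings. */
int toy_set_attribute(toy_template_t *t, const char *name, const char *value)
{
    if (t == NULL || name == NULL || value == NULL ||
        strlen(name) >= TOY_MAX_LEN || strlen(value) >= TOY_MAX_LEN)
        return AR_ERRNO_INVALID_ARGUMENT;

    size_t i;
    for (i = 0; i < t->count; i++)
        if (strcmp(t->name[i], name) == 0)
            break;

    if (i == t->count) {            /* attribute not present yet */
        if (t->count == TOY_MAX_ATTRS)
            return AR_ERRNO_NO_MEMORY;
        strcpy(t->name[i], name);   /* length checked above */
        t->count++;
    }
    strcpy(t->value[i], value);     /* length checked above */
    assert(t->count <= TOY_MAX_ATTRS);
    return AR_ERRNO_SUCCESS;
}

/* Returns the stored value, or NULL if the attribute was never set. */
const char *toy_get_attribute(const toy_template_t *t, const char *name)
{
    if (t == NULL || name == NULL)
        return NULL;
    for (size_t i = 0; i < t->count; i++)
        if (strcmp(t->name[i], name) == 0)
            return t->value[i];
    return NULL;
}
```

A caller would zero-initialize a toy_template_t, set attributes such as AR_N_CPUS, and read them back with toy_get_attribute; setting the same attribute twice overwrites the earlier value.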
Implementation
Implementations of the AR API for three different DRMSs are currently under way:
- AR API for LSF
- AR API for PBS Pro
- AR API for SGE
Unfortunately, different DRMSs maintain ARs in different ways. Minor changes to the AR API may therefore be made in order to make it more coherent and "implementable" across the various DRMSs (for example, in AR information retrieval). However, the general ideas and concepts described above should not change.