Efficient management of large clusters that provide grid services on every machine requires central repositories of GARs. The GarmB tool creates such repositories using XML descriptors, which minimise network traffic. In theory, using a shared NFS directory instead of a descriptor-based repository generates 50 to 2000 times as much configuration-related traffic.
There are two parts to using repositories. Firstly, on each machine that will use the repository, the environment variable GARM_REPOSITORY must be set to the URL of the repository. GARM will resolve all remote references as being relative to this URL. This is all that is required at the client side.
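The client-side resolution described above can be sketched in Python. This is only an illustration, not GARM's own code: it assumes that "relative to this URL" means ordinary URL joining, and the function name resolve_reference, the host repo.example.org, and the descriptor path are all hypothetical.

```python
import os
from urllib.parse import urljoin


def resolve_reference(reference, environ=os.environ):
    """Resolve a remote reference against GARM_REPOSITORY (illustrative only)."""
    base = environ.get("GARM_REPOSITORY")
    if not base:
        raise RuntimeError("GARM_REPOSITORY is not set")
    # urljoin treats the base as a directory only if it ends with a slash.
    if not base.endswith("/"):
        base += "/"
    return urljoin(base, reference)


if __name__ == "__main__":
    # Hypothetical repository URL and reference, for demonstration only.
    env = {"GARM_REPOSITORY": "http://repo.example.org/descriptors"}
    print(resolve_reference("services/counter.garx", env))
```

Passing the environment as a parameter keeps the sketch easy to try without changing the real environment of the shell.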
Secondly, the GARM tool GarmB should be used to create the repository data. The files themselves can be served by a standard web server or FTP server. Each descriptor GarmB creates contains a pointer to the location of the GAR it describes. When a descriptor has been retrieved and the GAR itself is required, this URL is used to retrieve the GAR. One consequence is that the machine that serves the descriptors need not be the machine that serves the GARs themselves.
GarmB has the following syntax:
garmb url dir files
The url argument specifies the prefix that is applied to each file's relative path to create the URL pointer to that GAR. The dir argument is the directory in which the generated descriptors should be placed. Finally, the files argument is a list of GARs to create descriptors for. If directories are given in the files argument, they are searched recursively and every GAR found is described.
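The way the three arguments combine can be sketched as follows. This is a rough model of the behaviour described above, not GarmB's implementation: the function names, the host host.example, and the assumption that GAR files end in .gar are all hypothetical.

```python
import os


def pointer_url(url_prefix, relative_path):
    """Append a GAR's relative path to the URL prefix."""
    rel = os.path.normpath(relative_path).replace(os.sep, "/")
    return url_prefix.rstrip("/") + "/" + rel


def find_gars(files, suffix=".gar"):
    """Yield each GAR named in files, searching directories recursively.

    The .gar suffix is an assumption made for this sketch.
    """
    for entry in files:
        if os.path.isdir(entry):
            for root, dirs, names in os.walk(entry):
                dirs.sort()  # visit subdirectories in a stable order
                for name in sorted(names):
                    if name.endswith(suffix):
                        yield os.path.join(root, name)
        else:
            yield entry


def describe(url_prefix, files):
    """Return (relative path, URL pointer) pairs for every GAR found."""
    return [(path, pointer_url(url_prefix, path)) for path in find_gars(files)]
```

Calling describe with a prefix and a list containing a directory returns one pair per GAR beneath that directory, with each URL built from the path exactly as it was given on the command line.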
For example, suppose we have a machine which we wish to use as a repository. Our GARs are under /disk1/gars/, which is also available by HTTP. To place the descriptors under /disk2/descriptors, we can invoke GarmB (from the directory /disk1) as follows:
garmb /disk2/descriptors gars/
The directory in which GarmB is invoked is important, because the relative path to each GAR is appended to the URL given. If, for example, we had invoked GarmB from some other location, say /home/idiot/, the relative paths would differ and the generated descriptors would all refer to the wrong URLs, which is not what we want.
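A small sketch makes the effect of the invocation directory concrete. It assumes that the path appended to the URL is the GAR's path relative to the directory GarmB was started in. The prefix http://host.example and the file name a.gar are hypothetical, and GarmB's actual output may differ in detail.

```python
import posixpath


def pointer_for(url_prefix, gar_path, invoked_in):
    """Build the URL pointer from the GAR's path relative to the invocation directory."""
    relative = posixpath.relpath(gar_path, start=invoked_in)
    return url_prefix.rstrip("/") + "/" + relative


if __name__ == "__main__":
    # Invoked from /disk1: the relative path matches the served layout.
    print(pointer_for("http://host.example", "/disk1/gars/a.gar", "/disk1"))
    # Invoked from /home/idiot: the relative path climbs out of the directory.
    print(pointer_for("http://host.example", "/disk1/gars/a.gar", "/home/idiot"))
```

Under these assumptions the first call yields a pointer ending in gars/a.gar, while the second yields one containing ../../disk1/gars/a.gar, which no longer matches the served location.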
The descriptors generated by GarmB are location independent: they can be moved around and copied freely, and will always be correct within the scope of the URLs used to create them. (If the descriptors point to machines that are visible only within some scope, they will be useless outside that scope.) Descriptors have the extension .garx for GAR files.