Analysis Environment Distributed Computing Requirements
Support Multiple Distributed Computing Models
Distributed systems such as ROOT/PROOF, some MPI and PVM implementations, etc., make assumptions about the execution environment that can hinder deployment. For example, some MPI implementations assume that jobs on remote machines will be started via "rsh" or "ssh", which complicates running parallel jobs in batch queues. Ideally, our analysis environment should not force a particular computing model, but should be adaptable to a variety of scheduling scenarios and authentication systems. Use cases that should be supported include:
- Execution on a single system, such as a personal laptop
- Execution on an ad-hoc collection of systems (e.g., all the systems in an office), with automatic discovery of available systems on the local network
- Batch queue execution, either as a parallel job with a group of systems reserved, or using whatever systems are available
- Execution on a dedicated cluster, supporting dynamic allocation of computing resources between competing analysis jobs
The variety of usage patterns means that multiple models must be supported for execution, authorization (including credential delegation) and possibly data discovery. Wherever possible, support for particular computing models should be delegated to plugin modules or external programs.
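The delegation to plugin modules could be as simple as a registry of execution backends keyed by computing model. The sketch below is purely illustrative; none of the names (ExecutionBackend, register_backend, LocalBackend) exist in the AE.

```python
from abc import ABC, abstractmethod

# Hypothetical sketch of delegating computing-model support to plugin
# modules: each model (local, batch, cluster, ...) registers a backend
# class implementing a common interface.  Names invented for illustration.

class ExecutionBackend(ABC):
    """Interface each computing-model plugin implements."""

    @abstractmethod
    def start_workers(self, n):
        """Launch n workers; return opaque handles for them."""

_BACKENDS = {}

def register_backend(name, cls):
    """Make a backend available under a model name."""
    _BACKENDS[name] = cls

def get_backend(name, *args, **kwargs):
    """Instantiate the backend registered for `name`."""
    return _BACKENDS[name](*args, **kwargs)

class LocalBackend(ExecutionBackend):
    """Covers the single-system use case (e.g., a personal laptop)."""

    def start_workers(self, n):
        # A real backend would spawn subprocesses here.
        return [f"local-worker-{i}" for i in range(n)]

register_backend("local", LocalBackend)
```

A batch-queue or ad-hoc-discovery backend would register under its own name, leaving the core scheduler unaware of how workers are actually started.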
Authentication and Authorization
The variety of common authentication methods supported by OpenSSH, plus the common GRID authentication methods (principally X.509 certificates), is a reasonable starting point. Even for execution modes where each process is pre-authenticated (e.g., by a batch queue gatekeeper function), authentication of the communications between the distributed processes will be required. GSSAPI should be used where possible.
Authorization should include some form of group selection procedure, so that a single user could have multiple groups of AE jobs working on different analyses. Support for lists of groups, with dynamic reassignment between the groups, is desirable.
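The group-list requirement with dynamic reassignment amounts to a small piece of bookkeeping on the master side. A minimal sketch (GroupManager and its methods are assumptions, not an existing AE API):

```python
# Toy sketch of per-user AE job groups with dynamic reassignment:
# a worker belongs to at most one group and can be moved between
# groups while running.  All names are illustrative.

class GroupManager:
    def __init__(self):
        self._groups = {}       # group name -> set of worker ids
        self._assignment = {}   # worker id -> current group name

    def assign(self, worker, group):
        """Move a worker into `group`, removing it from any previous group."""
        old = self._assignment.get(worker)
        if old is not None:
            self._groups[old].discard(worker)
        self._groups.setdefault(group, set()).add(worker)
        self._assignment[worker] = group

    def members(self, group):
        """Workers currently assigned to `group`."""
        return set(self._groups.get(group, ()))
```

With this shape, a user could keep one group per analysis and rebalance workers between them as load shifts.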
Dynamic, Resilient and Opportunistic
For a high quality "out of the box" customer experience, the default configuration should support the use cases that do not involve a central, managed installation of the product. For these modes, the system should have good fault-tolerance characteristics, accommodating the dynamic addition and removal of available computing resources.
Client disconnect/reconnect must be supported in the modes of operation that use an intermediated web service. Fault-tolerance for the web service is out of scope for the AE prototype. Graceful departure of a worker from a computation at an agreed-upon time would be very desirable, particularly for execution in mixed-use batch queues. Recovery from unplanned worker departure (without starting over from the beginning of the calculation) would depend on the ability to resume a calculation on a different worker at the point of failure, or on rolling the results of the failed worker back to a check or commit point.
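The rollback option can be sketched as a master that merges worker results only at commit points, so an unplanned departure discards at most the work since the worker's last commit. This is a toy illustration of the idea, not the AE design; all names are hypothetical.

```python
# Sketch of commit-point rollback: workers report partial results,
# which become durable only when committed.  On unplanned failure the
# uncommitted work is dropped and returned for reassignment.

class CommitPointMerger:
    def __init__(self):
        self.committed = 0      # durable, merged total
        self.pending = {}       # worker id -> uncommitted partial result

    def report(self, worker, partial):
        """Accumulate a worker's uncommitted partial result."""
        self.pending[worker] = self.pending.get(worker, 0) + partial

    def commit(self, worker):
        """Commit point: fold the worker's pending work into the total."""
        self.committed += self.pending.pop(worker, 0)

    def worker_failed(self, worker):
        """Roll back an unplanned departure; return the lost work
        so it can be redone on a different worker."""
        return self.pending.pop(worker, 0)
```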
Client Side Code Independence
The analysis environment should not be dependent upon the software versions used by the client, in particular the version of 'third party' software (e.g., ROOT). The remote servers should be able to handle any versions of software used by the client and should not be tied to one software version on startup.
Scenarios
Ad Hoc Computing
The default configuration should allow a graduate student to start AE workers on a few systems and have it do something useful with a minimum of configuration. AE workers should be discoverable via ZeroConf or a similar discovery protocol. Once discovered, supported scenarios should include:
- All discovered workers join a designated (or default) group with no authentication. This mode is only safe in private sandboxes.
- Workers and master(s) mutually authenticate, for example via a pre-shared public/private X.509 key pair. It should be possible to configure workers into groups working on different tasks.
- For interactive browsing, each worker should generate its own public/private key pair, stored on stable storage, and use it to announce its availability for ssh-style "leap of faith" authentication: a worker accepted once interactively is trusted so long as it announces itself with the same key pair.
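The "leap of faith" scheme is trust-on-first-use, as in ssh known_hosts: pin the fingerprint of the key a worker first announced with, and trust later announcements only if the key matches. A minimal sketch (class and method names are invented for illustration):

```python
import hashlib

# ssh-style trust-on-first-use for worker self-announcement:
# the first key seen for a worker is pinned; subsequent announcements
# are trusted only if they carry the same key.  Not an AE component.

class LeapOfFaithStore:
    def __init__(self):
        self._known = {}        # worker id -> pinned key fingerprint

    @staticmethod
    def fingerprint(pubkey_bytes):
        return hashlib.sha256(pubkey_bytes).hexdigest()

    def check(self, worker, pubkey_bytes):
        """True on first contact (key is now pinned) or if the
        announced key matches the pinned one; False otherwise."""
        fp = self.fingerprint(pubkey_bytes)
        pinned = self._known.setdefault(worker, fp)
        return pinned == fp
```

In a real deployment the pinned fingerprints would live on stable storage, and the first acceptance would be an interactive decision by the user.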
Managed Systems
The primary managed systems scenarios are execution in mixed-use batch queues, and execution on dedicated systems. For the batch queue case, discovery and authentication could proceed as in the ad hoc scenarios, or there could be a pre-configured master to contact, in which case the workers will not make themselves discoverable. For dedicated systems, there should be a helper daemon (which should also be usable from inetd) which deals with initial connection establishment and authentication, and then forks the worker. The functionality requirements for the helper daemon are very similar to sshd.
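Before forking the worker, the helper daemon would perform some initial handshake over the new connection. The wire format below is entirely invented for illustration (the AE protocol is unspecified here); it only shows the shape of a parse-authenticate-reply step that an sshd-like daemon, or an inetd-launched instance, would run first.

```python
# Hypothetical first-line handshake for the helper daemon: the client
# sends "AE-HELLO <group> <token>", the daemon validates the token and
# replies OK or DENY before forking the worker.  Protocol and names
# are assumptions, not an AE specification.

VALID_TOKENS = {"s3cret"}   # stand-in for real authentication state

def handle_hello(line):
    """Validate one handshake line; return the reply to send."""
    parts = line.strip().split()
    if len(parts) != 3 or parts[0] != "AE-HELLO":
        return "DENY malformed"
    _, group, token = parts
    if token not in VALID_TOKENS:
        return "DENY auth"
    # A real daemon would now fork and exec the worker for `group`.
    return f"OK {group}"
```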
--
ChrisDJones - 27 Oct 2006
--
DanRiley - 25 Oct 2006