Overview

From GCube System
Revision as of 14:52, 21 January 2009 by Fabio.simeoni

Here is a quick tour of the main functional areas in which gCF lends a hand (or two). We discuss each topic in detail in the rest of the Guide.

Lifecycle Management

The RIs of gCube Services ought to behave like finite state machines, transitioning on key events from deployment and initialisation to activation and failure.

Whilst in some states, RIs ought to engage in specific activities (e.g. perform initialisation after successful deployment). In others, they ought to refuse any external engagement and wait patiently for the occurrence of external events. Some of these events may be related to the completion of previous activities (e.g. successful initialisation for operation in an unsecure infrastructure). Others depend on external stimuli (e.g. the arrival of valid credentials for autonomic operation in a secure infrastructure). Yet other events are associated with the failure of other transitions (e.g. gHN bootstrapping failures). Finally, some transitions ought to be monitored and published within the infrastructure (e.g. activation or failure) while others are instead to be kept private (e.g. initialisation).

gCF manages the entire lifetime of gCube services transparently, leaving gCube developers with the sole responsibility of communicating failure whenever they observe it. At the same time, it allows developers to customize service behavior on each transition in accordance with service-specific semantics.
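The state-machine view described above can be illustrated with a minimal sketch. The state names, the `Lifecycle` class, and the `onTransition` hook are all assumptions made for illustration; they are not gCF types, and the actual set of states and transitions may differ.

```java
import java.util.ArrayList;
import java.util.EnumSet;
import java.util.List;
import java.util.Set;

/** Hypothetical sketch of an RI lifecycle; none of these names are gCF types. */
final class LifecycleSketch {

    /** Illustrative states only; the real state set may differ. */
    enum State {
        DEPLOYED, INITIALISED, READY, FAILED;

        /** States that may legally follow this one. */
        Set<State> successors() {
            switch (this) {
                case DEPLOYED:    return EnumSet.of(INITIALISED, FAILED);
                case INITIALISED: return EnumSet.of(READY, FAILED);
                case READY:       return EnumSet.of(FAILED);
                default:          return EnumSet.noneOf(State.class);
            }
        }

        /** Activation and failure are published; initialisation stays private. */
        boolean isPublished() {
            return this == READY || this == FAILED;
        }
    }

    /** Tracks the current state and records the transitions that would be published. */
    static class Lifecycle {
        private State current = State.DEPLOYED;
        private final List<State> published = new ArrayList<>();

        /** Service-specific hook, invoked after every successful transition. */
        protected void onTransition(State from, State to) { }

        final synchronized void moveTo(State next) {
            if (!current.successors().contains(next)) {
                throw new IllegalStateException(current + " -> " + next + " is not allowed");
            }
            State previous = current;
            current = next;
            if (next.isPublished()) {
                published.add(next);
            }
            onTransition(previous, next);
        }

        synchronized State current() { return current; }

        synchronized List<State> published() { return new ArrayList<>(published); }
    }

    public static void main(String[] args) {
        Lifecycle lifecycle = new Lifecycle();
        lifecycle.moveTo(State.INITIALISED);
        lifecycle.moveTo(State.READY);
        System.out.println(lifecycle.current() + " " + lifecycle.published());
    }
}
```

In this sketch a developer would customise behaviour by overriding `onTransition`, while the legality of transitions and the decision about what to publish stay in the base class.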

Scope Management

All gCube Resources - gHNs, services, RIs, content and metadata collections, etc. - are associated with one or more scopes that determine their visibility and/or usability within a gCube infrastructure. RIs, in particular, may operate in different scopes at different times and clients must specify a scope when contacting them. Similarly, any state that may be associated with RIs must retain the scope of the call which triggered its creation. Scope is in itself a multi-faceted notion as it may encompass the whole infrastructure, a single Virtual Organisation, or a single Virtual Research Environment.

Overall, scope management must enforce a complex set of scoping rules that vary across types of resources and reflect semantic relationships among resources. For example, the scope of an RI is subordinate to that of the gHN in which it is deployed. Similarly, the scope of the state of an RI is subordinate to the scope of the RI itself. State that is reused across different scopes should be published in all of them; however, when it is dismissed in some but not all of its scopes, it should only be unpublished from those scopes, not removed altogether. Calls should be rejected if they are incompatible with the scope of the target RIs, but their scope should propagate from RI to RI in a distributed workflow and from thread to thread within the runtime of a single RI.

gCF transparently handles scope bindings and the enforcement of scope rules. It also offers abstractions to simplify the propagation of scope within single runtimes (scope managers) and across different runtimes (stub proxies).
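Two of the rules above, subordination between scopes and thread-to-thread propagation, can be sketched in a few lines. The `Scope` class, the path-like naming of scopes (for example `/infra/vo/vre`), and the static accessors are assumptions made for illustration; they do not reproduce the scope managers of gCF.

```java
/** Hypothetical sketch of scope nesting and propagation; not the gCF scope manager. */
final class ScopeSketch {

    /** A scope named by a path such as /infra/vo/vre (an assumed convention). */
    static final class Scope {
        private final String path;

        Scope(String path) { this.path = path; }

        /** True if this scope equals the other scope or is nested inside it. */
        boolean isEnclosedBy(Scope other) {
            return path.equals(other.path) || path.startsWith(other.path + "/");
        }

        @Override public String toString() { return path; }
    }

    /** A thread inherits the scope that is current in the thread that creates it. */
    private static final InheritableThreadLocal<Scope> CURRENT = new InheritableThreadLocal<>();

    static void setScope(Scope scope) { CURRENT.set(scope); }

    static Scope getScope() { return CURRENT.get(); }

    public static void main(String[] args) throws InterruptedException {
        Scope vo = new Scope("/infra/vo");
        Scope vre = new Scope("/infra/vo/vre");
        System.out.println(vre + " enclosed by " + vo + ": " + vre.isEnclosedBy(vo));

        setScope(vre);                           // scope of the incoming call
        Thread worker = new Thread(
                () -> System.out.println("worker runs in " + getScope()));
        worker.start();                          // the worker sees the caller's scope
        worker.join();
    }
}
```

Note that `InheritableThreadLocal` copies the value only when a thread is created, so threads taken from a pool would need the scope to be set explicitly before each task.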

Security Management

RIs that operate within secure infrastructures may need to ascertain the identity and privileges of their clients before agreeing to process their requests. If such processing requires interactions with other RIs, then they may need to assume the identity and privileges of their clients, while still complying with the security requirements defined by the target RIs. RIs may also need to act autonomically and thus interact with Security Services for the periodic renewal and localisation of their own credentials. And beyond authentication and authorisation, RIs may expect certain levels of privacy and integrity from interactions.

gCF transparently handles the security requirements of RIs that operate in a secure infrastructure, whether they need to use their own credentials, those of their clients, or both. Developers specify what type of credentials ought to be used to process a given request, and service-wide security managers make them asynchronously available at the points at which they are needed, i.e. before making outgoing calls. As with scoping, stub proxies interact autonomously with security managers to set credentials, privacy, and integrity settings on outgoing calls, in accordance with the default requirements of the target instances. Furthermore, all security-related actions undertaken by the framework and by developers are silently disabled if the service instance happens to operate in an unsecured infrastructure. This makes service implementations independent of the security settings of the environments in which their RIs might happen to operate.

State Management

RIs may engage in ‘stateful’ interactions with their clients, in that their responses are not solely a function of requests but rely on some notion of application-level state that pre-exists, persists, and may evolve as a result of those interactions.

State that needs to be consumed by different clients, at different times, and for different purposes requires system-wide mechanisms for its publication and discovery. Based on these, systemic solutions for state management (e.g. scheduled destruction) and state change (subscriptions and notifications) can then be implemented. Standard publication and discovery mechanisms, on the other hand, can only be built on standards for modeling state and access to state. It is thus a systemic requirement in gCube that the design of stateful services follow the Web Service Resource Framework (WSRF) set of standards for state management.

In particular, stateful services ought to model state as one or more WS-Resources, i.e. first-class Web Resources which:

  • aggregate local implementations of state - the stateful resources - with some port-type through which they can be publicly accessed;
  • do so transparently, in the sense that port-types and stateful resources can be unambiguously identified from the WS-Resources which aggregate them. Precisely, WS-Resources are identified by a combination of the endpoint of their port-type and the identifier of their stateful resource (based on conventional use of WS-Addressing standards). WS-Resource identifiers can then be used to access their stateful resource through their port-type (implied resource pattern);
  • publish the type of distinguished properties of their stateful resources in the interface of their port-types (the Resource Property Document). These include properties which are specific to the semantics of the service and properties which support system-level management of the WS-Resource;
  • extend their port-type to implement standard interfaces for the consumption of Resource Properties, including interfaces for state publication and discovery (WS-ResourceProperties), for state life-time management (WS-ResourceLifetime), and for state change subscription and notifications (WS-BaseNotification).

Finally, many stateful services may need to recover their state after node failures or node relocations. These services ought then to persist their state, both locally (to recover from failures) and remotely (to recover from relocations).

gCF simplifies the adoption of, and compliance with, WSRF standards in two important ways. First, it implements the interfaces and predefines properties that are common to all WS-Resources, either transparently to the developer or else by means of simple configuration.

Second, it offers a comprehensive set of base abstractions for modeling stateful resources, which reflect and support common design patterns (e.g. singleton resource, multiple resources, multiple resource views over shared resources). Across all such patterns, the pre-defined abstractions factor out the basic behavior and inter-dependencies which - regardless of service-specific semantics - may be found across the processes of creation, initialisation, destruction, and persistence of stateful resources. Service-specific semantics can then be injected by extending or customizing the abstractions at the fine level of granularity that is normally associated with the use of the template pattern (callbacks).

Fault Management

Service instances throw faults to signal the occurrence of unpredictable circumstances within the runtime environment, which force deviations from the control flow expected by the implementation of the service. The intention is to characterize problems for clients on the assumption that, based on the characterization, they may react in some useful way.

gCube services ought to be designed so as to return three broad types of faults, depending on whether the fault is deemed to be unavoidable across any instance of the service (unrecoverable), perhaps avoidable at some instance other than the one which observes it (retry-equivalent), or perhaps avoidable in the future at the same instance which observes it (retry-same).

A resilient client may then exploit these broad semantics to react in ways that are more useful than gracefully desisting. In particular, a client which is presented with either of the last two types of fault can try to recover accordingly, by trying other instances or by retrying with the same instance at a later time. Similarly, a client that is presented with an unrecoverable fault may avoid consuming further resources in the attempt.
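As a rough illustration of how a client might act on the three fault types, consider the sketch below. The exception classes, the `Reaction` enumeration, and `callWithRetry` are hypothetical stand-ins introduced here; they are not the fault declarations that gCF predefines.

```java
import java.util.concurrent.Callable;

/** Hypothetical client-side view of the three fault types; not the gCF declarations. */
final class FaultSketch {

    static class ServiceFault extends Exception {
        ServiceFault(String message) { super(message); }
    }

    static final class UnrecoverableFault extends ServiceFault {
        UnrecoverableFault(String message) { super(message); }
    }

    static final class RetryEquivalentFault extends ServiceFault {
        RetryEquivalentFault(String message) { super(message); }
    }

    static final class RetrySameFault extends ServiceFault {
        RetrySameFault(String message) { super(message); }
    }

    /** What a resilient client might do next. */
    enum Reaction { GIVE_UP, TRY_ANOTHER_INSTANCE, RETRY_LATER }

    static Reaction reactTo(ServiceFault fault) {
        if (fault instanceof RetrySameFault) return Reaction.RETRY_LATER;
        if (fault instanceof RetryEquivalentFault) return Reaction.TRY_ANOTHER_INSTANCE;
        return Reaction.GIVE_UP;   // unrecoverable: no point in consuming more resources
    }

    /**
     * Calls the same instance again after a retry-same fault, making at most
     * maxAttempts calls in total. Any other failure reaches the caller at once.
     */
    static <T> T callWithRetry(Callable<T> call, int maxAttempts, long pauseMillis)
            throws Exception {
        for (int attempt = 1; ; attempt++) {
            try {
                return call.call();
            } catch (RetrySameFault fault) {
                if (attempt >= maxAttempts) throw fault;
                Thread.sleep(pauseMillis);
            }
        }
    }

    public static void main(String[] args) throws Exception {
        int[] calls = {0};
        String result = callWithRetry(() -> {
            if (++calls[0] < 3) throw new RetrySameFault("instance is busy");
            return "done";
        }, 5, 10L);
        System.out.println(result + " after " + calls[0] + " calls");
    }
}
```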

gCF offers a number of facilities for dealing with the three types of gCube faults. First, it predefines their type declarations for immediate importing within service interfaces.

Second, it provides default implementations of these interfaces which:

  • serialize and deserialize on the wire in accordance with gCube requirements;
  • need not be included in stub distributions;
  • can be handled as normal exceptions within the code (e.g. may wrap other exceptions and be caught in try/catch blocks).

Third, it mirrors faults with lightweight exceptions for convenience of use within the service implementation: as faults and exceptions are freely convertible, a service may be designed to:

  • convert exceptions into corresponding faults as these exit the scope of the service;
  • convert faults received from remote services into the corresponding exceptions and let these percolate up the call stack.

While the transparencies discussed so far relate to the fulfillment of system requirements, the framework also offers tools that may support developers in the implementation of service-specific semantics, as shown below.

Configuration Management

Like most open technologies, those associated with Web Service development rely heavily on access to configuration resources. In Java, these include both classpath and file system resources, and may vary in their format from property files, to XML documents, to schemas for those documents. Most importantly, they tend to cover virtually all aspects of service development: from building and deployment, to logical and physical specification, to service initialization and state publication, to logging and security policies.

Under the assumption of dynamic service deployment, however, programmatic access to file system resources may not rely on absolute knowledge of their location. Rather, it ought to occur relative to key locations that are dynamically resolved with respect to the gHNs on which service instances are actually deployed. Further, access to file system resources ought to be resilient to I/O failures and other forms of corruption.

Finally, access to configuration resources ought to be available not only for introspection of service instances, but also for exploration and discovery of the runtime environment in which they operate. Of particular interest here may be the configuration of the gHN in which the instances are deployed, and in some advanced scenarios also the (public) configuration of co-deployed service instances.

Based on these requirements, gCF offers basic facilities to access arbitrary configuration resources, in read or write mode, on both the classpath and the file system. Write operations on the file system are implicitly associated with backup operations and, in the case of failure, read operations are transparently redirected to the available backups.
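The backup-on-write and fallback-on-read behaviour can be pictured with the following sketch. The `ConfigStoreSketch` class, the `.bak` suffix, and the choice to refresh the backup after each successful write are assumptions made for illustration; the source does not describe how gCF organises its backups.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

/** Hypothetical sketch of resilient file access; not the gCF configuration API. */
final class ConfigStoreSketch {

    /** Root location, resolved at deployment time rather than hard-coded. */
    private final Path root;

    ConfigStoreSketch(Path root) { this.root = root; }

    /** Assumed convention: the backup lives next to the resource with a .bak suffix. */
    private static Path backupOf(Path file) {
        return file.resolveSibling(file.getFileName() + ".bak");
    }

    /** Writes the resource, then refreshes its backup copy. */
    void write(String relativeName, String content) throws IOException {
        Path file = root.resolve(relativeName);
        Files.createDirectories(file.getParent());
        Files.write(file, content.getBytes(StandardCharsets.UTF_8));
        Files.copy(file, backupOf(file), StandardCopyOption.REPLACE_EXISTING);
    }

    /** Reads the resource, falling back to the backup if the primary cannot be read. */
    String read(String relativeName) throws IOException {
        Path file = root.resolve(relativeName);
        try {
            return new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
        } catch (IOException primaryFailure) {
            return new String(Files.readAllBytes(backupOf(file)), StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) throws IOException {
        Path root = Files.createTempDirectory("config-sketch");
        ConfigStoreSketch store = new ConfigStoreSketch(root);
        store.write("service.properties", "name=demo");
        Files.delete(root.resolve("service.properties"));       // simulate a lost file
        System.out.println(store.read("service.properties"));   // served from the backup
    }
}
```

Callers address resources by names relative to the root, which mirrors the requirement that access should not depend on absolute knowledge of file locations.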

Built on top of these facilities, standard configuration resources which govern building, deployment, security, publication, and similar cross-service processes are pre-parsed and exposed through dedicated channels and object bindings for convenience of inspection. Similar channels and bindings are available for access to gHN configuration. In addition, high-level access to local directory services (JNDI) extends the convenience of object bindings to idiosyncratic configuration elements, at little or no extra cost. Finally, access to the configuration of different run-time entities - e.g. gHN configuration vs. service-wide configuration vs. port-type specific configuration - is rationalized and distributed along a hierarchy of corresponding context objects.

Discovery & Publication

Interacting with instances of other services is a task that all but the simplest gCube services and clients need to engage in. Discovering such instances by characterization of their own properties, or else those of the state they may produce and/or handle, is normally a prerequisite for the required interactions. Put another way, interacting with instances of any given service requires first interacting with instances of services in the Information System.

gCF embeds a client library (more precisely the interface and the reference implementation of such library) for querying the Information Service, which simplifies the formulation of queries and the processing of results.

Transparencies vary across classes of queries, reflecting different trade-offs between generality and simplicity. Free-form queries admit arbitrary query expressions and return lists of XML results wrapped inside XPath engines. Named queries sacrifice the full generality of free-form queries but reduce their formulation to a simpler case of template instantiation. Resource queries specialize further to return gCube Resources but offer flexibility in the specification of filters and the convenience of optimal object bindings for the results. WS-Resource queries offer similar advantages for another large class of common queries, those on WS-Resources published by stateful service instances. Finally, all types of queries can be executed by value - the execution returns with the totality of results - or by reference - results are embedded in a lazily produced and lazily consumed result set.

The counterpart of discovery is publication, and the framework also offers a dedicated client library for publishing gCube resources and WS-Resources with the Information Service. The library is used extensively and periodically by gCF itself, to publish information about the node and deployed instances, as well as by other key Enabling services. It is otherwise unlikely to be directly used by many gCube services, given that the publication of gCube resources and WS-Resources which relate to individual services is carried out by gCF, either transparently to their developers or else by means of simple configuration.

Event Management

Service and client implementations that spread over a large number of components benefit from drawing loosely coupled connections between them. Asynchronous communication based on the exchange of events - possibly through third-party mediation - is a well-known approach to loosely coupled interactions which proves useful both across multiple runtimes and within a single runtime.

gCF includes a small number of interrelated abstractions that model the prototypical actors of a local, event-based model of communication: producers accept subscriptions from consumers and notify them of the occurrence of past and future events about the topics to which they have subscribed.

Producers, consumers, and events are parametric in the type of topics and event payloads, and can thus be instantiated more or less concretely to achieve the desired compromise between static type checking and flexibility. Focusing on producers, for example, some of them may handle events about a given type of topic but otherwise carrying arbitrary payloads. Others may allow events about any topic as long as they carry a given type of payload. Yet others may statically refuse to handle any event that is not about a given topic and does not carry a given payload.
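The parametric design just described can be approximated with Java generics. In the sketch below, `Event`, `Consumer`, `Producer`, and the `Topic` enumeration are hypothetical names chosen for illustration and do not reproduce the gCF abstractions; the replay of past events on subscription is one possible reading of "past and future events".

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical local event model, generic in topic type T and payload type P. */
final class EventSketch {

    /** An event about a topic of type T carrying a payload of type P. */
    static final class Event<T, P> {
        final T topic;
        final P payload;

        Event(T topic, P payload) {
            this.topic = topic;
            this.payload = payload;
        }
    }

    interface Consumer<T, P> {
        void onEvent(Event<T, P> event);
    }

    static final class Producer<T, P> {
        private final Map<T, List<Consumer<T, P>>> consumers = new HashMap<>();
        private final List<Event<T, P>> history = new ArrayList<>();

        /** Registers the consumer and replays past events about the topic. */
        synchronized void subscribe(T topic, Consumer<T, P> consumer) {
            consumers.computeIfAbsent(topic, t -> new ArrayList<>()).add(consumer);
            for (Event<T, P> past : history) {
                if (past.topic.equals(topic)) {
                    consumer.onEvent(past);
                }
            }
        }

        /** Records the event and delivers it to the consumers of its topic. */
        synchronized void publish(T topic, P payload) {
            Event<T, P> event = new Event<>(topic, payload);
            history.add(event);
            for (Consumer<T, P> consumer : consumers.getOrDefault(topic, new ArrayList<>())) {
                consumer.onEvent(event);
            }
        }
    }

    /** Enumerated topics, the simplest of the options for modelling topics. */
    enum Topic { READY, FAILED }

    public static void main(String[] args) {
        Producer<Topic, String> strict = new Producer<>();      // fixed topic and payload types
        Producer<Topic, Object> anyPayload = new Producer<>();  // fixed topics, arbitrary payloads

        strict.publish(Topic.READY, "service is ready");
        strict.subscribe(Topic.READY, e -> System.out.println(e.topic + ": " + e.payload));

        anyPayload.subscribe(Topic.FAILED, e -> System.out.println(e.topic + ": " + e.payload));
        anyPayload.publish(Topic.FAILED, new IllegalStateException("disk full"));
    }
}
```

With `Producer<Topic, String>`, an attempt to publish a payload of another type is rejected by the compiler, which is the static end of the compromise between type checking and flexibility.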

Events and producers are partially implemented for convenience and to ensure correctness: developers extend the former and may extend or encapsulate the latter. Topics and consumers are instead left unimplemented for flexibility: developers may enumerate topics for maximum simplicity or else organize them in arbitrarily complex hierarchies for flexibility; consumers may then be bound to individual types of topics or to roots of arbitrarily deep hierarchies. In the latter case event consumption entails a type analysis on the events in input and possibly a dispatch to event-specific methods. Producers may then simplify the task of consumers by presenting them with partial implementations that perform potentially complex type-analysis but dispatch to event-specific methods that are left unimplemented.

Similar patterns are heavily used within the gCF itself to synchronise service implementations that are deployed on the same gHN and yet are unknown to each other. Through events, in particular, all service implementations synchronize transparently with key local services that are responsible for issues as critical as the management of their credentials and their lifetime.

Local Processes & Remote Interactions

From a sufficiently high level perspective, service and client implementations may be thought of as coordinating a number of local processes. Some processes may compose in chains, others may unfold in parallel, and yet others may execute periodically. Many will of course engage in interactions with remote service instances.

If such processes were given dedicated object models with standard interfaces, then the most general (and tedious, error-prone, or downright complicated) aspects of their coordination - i.e. how to chain them, parallelize them, and schedule them - could be factored out and reused across implementations.

Likewise, similarities between families of related processes - e.g. the steps which are common to any remote interaction, or indeed those common to all the interactions with instances of the same service - could be conveniently shared through conventional object-oriented mechanisms, i.e. by organizing implementations in inheritance hierarchies.

These observations justify one of the most versatile tools of the framework, namely a small set of interrelated abstractions used to explicitly model local processes. A handler is a stateful processor that acts upon, or on behalf of, some target object. A few key handlers are complex, in that their target is a list of ‘component’ handlers:

  • sequential handlers execute their components in chains, propagating the state of each component forward and echoing failures backward;
  • parallel handlers execute their components concurrently and apply strict or lax policies for handling their failures;
  • scheduled handlers have a single component and repeat its execution at fixed intervals, stopping either when explicitly told or at the occurrence of some customizable condition, e.g. the number of failures of its component.
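The three complex handlers listed above can be sketched as follows. All class names and signatures are assumptions made for illustration, and the policies are deliberately simplified: "strict" here means failing on the first component failure observed, and the scheduled handler stops after a fixed number of failures.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical handler model; names and signatures are illustrative, not gCF's. */
final class HandlerSketch {

    /** A stateful processor that acts upon a target of type T. */
    static abstract class Handler<T> {
        protected T target;
        void setTarget(T target) { this.target = target; }
        T getTarget() { return target; }
        abstract void run() throws Exception;
    }

    /** Runs components in order, passing the target of each one on to the next. */
    static final class SequentialHandler<T> extends Handler<T> {
        private final List<Handler<T>> components;
        SequentialHandler(List<Handler<T>> components) { this.components = components; }

        @Override void run() throws Exception {
            for (Handler<T> component : components) {
                component.setTarget(target);
                component.run();                 // a failure stops the chain and reaches the caller
                target = component.getTarget();  // state flows forward
            }
        }
    }

    /** Runs components concurrently under a strict or a lax failure policy. */
    static final class ParallelHandler<T> extends Handler<T> {
        private final List<Handler<T>> components;
        private final boolean strict;
        ParallelHandler(List<Handler<T>> components, boolean strict) {
            this.components = components;
            this.strict = strict;
        }

        @Override void run() throws Exception {
            ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, components.size()));
            try {
                List<Future<Void>> pending = new ArrayList<>();
                for (Handler<T> component : components) {
                    component.setTarget(target);
                    pending.add(pool.submit(() -> { component.run(); return null; }));
                }
                for (Future<Void> result : pending) {
                    try {
                        result.get();
                    } catch (ExecutionException failure) {
                        if (strict) throw failure;   // lax policy: ignore and carry on
                    }
                }
            } finally {
                pool.shutdown();
            }
        }
    }

    /** Repeats one component at fixed intervals until stopped or until it fails too often. */
    static final class ScheduledHandler<T> extends Handler<T> {
        private final Handler<T> component;
        private final long intervalMillis;
        private final int maxFailures;
        private volatile boolean stopped;
        ScheduledHandler(Handler<T> component, long intervalMillis, int maxFailures) {
            this.component = component;
            this.intervalMillis = intervalMillis;
            this.maxFailures = maxFailures;
        }

        void stop() { stopped = true; }

        @Override void run() throws Exception {
            int failures = 0;
            while (!stopped && failures < maxFailures) {
                component.setTarget(target);
                try { component.run(); } catch (Exception failure) { failures++; }
                Thread.sleep(intervalMillis);
            }
        }
    }

    public static void main(String[] args) throws Exception {
        Handler<Integer> increment = new Handler<Integer>() {
            @Override void run() { target = target + 1; }
        };
        List<Handler<Integer>> chain = new ArrayList<>();
        chain.add(increment);
        chain.add(increment);
        Handler<Integer> sequence = new SequentialHandler<>(chain);
        sequence.setTarget(0);
        sequence.run();
        System.out.println(sequence.getTarget());
    }
}
```

Because every complex handler is itself a `Handler`, the three can be nested freely, for instance a scheduled handler whose single component is a parallel handler over two sequential chains.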

Other pre-defined handlers specialize in best-effort interactions with remote services on behalf of the target service:

  • service handlers query the Information Service to find service instances, engage with each of them in turn until successful, and mark instances which result in successful interactions in order to engage them first in the future. In the process, service handlers make full use of gCube fault semantics, aborting in the face of unrecoverable faults and queuing instances which are not yet ready;
  • staging handlers extend the best-effort strategy of service handlers to the case of WS-Resources and, if none can be found in the infrastructure, try to create them through factory port-types of the service instances, again on a best-effort basis.
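The best-effort strategy of a service handler might look roughly like the sketch below. The fault classes, the `Interaction` interface, and the use of plain strings for endpoints are assumptions made for illustration, and the discovery query against the Information Service is left out: the list of discovered instances is simply passed in. Failures other than the three fault types are assumed to reach the caller unchanged.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

/** Hypothetical best-effort strategy over discovered instances; not the gCF service handler. */
final class ServiceHandlerSketch {

    static final class UnrecoverableFault extends Exception {
        UnrecoverableFault(String message) { super(message); }
    }

    static final class RetryEquivalentFault extends Exception {
        RetryEquivalentFault(String message) { super(message); }
    }

    static final class RetrySameFault extends Exception {
        RetrySameFault(String message) { super(message); }
    }

    /** One interaction with the instance found at the given endpoint. */
    interface Interaction<R> {
        R with(String endpoint) throws Exception;
    }

    /** The instance that worked last time; it is engaged first next time. */
    private String lastGood;

    <R> R run(List<String> discovered, Interaction<R> interaction) throws Exception {
        Deque<String> queue = new ArrayDeque<>(discovered);
        if (lastGood != null && queue.remove(lastGood)) {
            queue.addFirst(lastGood);
        }
        int requeuesLeft = discovered.size();   // bounds the work on not-yet-ready instances
        while (!queue.isEmpty()) {
            String endpoint = queue.pollFirst();
            try {
                R result = interaction.with(endpoint);
                lastGood = endpoint;
                return result;
            } catch (UnrecoverableFault fault) {
                throw fault;                     // no other instance can help: abort
            } catch (RetrySameFault fault) {
                if (requeuesLeft-- > 0) {
                    queue.addLast(endpoint);     // not ready yet: try it again later
                }
            } catch (RetryEquivalentFault fault) {
                // this instance cannot help: move on to the next one
            }
        }
        throw new Exception("no instance could serve the request");
    }

    public static void main(String[] args) throws Exception {
        ServiceHandlerSketch handler = new ServiceHandlerSketch();
        String reply = handler.run(Arrays.asList("a", "b", "c"), endpoint -> {
            if (endpoint.equals("a")) throw new RetryEquivalentFault("a cannot help");
            return "reply from " + endpoint;
        });
        System.out.println(reply);
    }
}
```

A developer-defined handler would, in this picture, supply only the `Interaction` that is specific to its port-type and inherit the iteration, queuing, and abort logic.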

Developers define their own handlers against this landscape of predefined ones. For example, they may use them as the components of complex handlers with the intention to coordinate them in some combination of sequential, parallel and scheduled execution (e.g. to parallelize two chains of processes every three minutes). Alternatively, they may extend their handlers from service and staging handlers so as to inherit their best-effort strategy and specialize it to the queries and port-types that are specific to their interactions. Families of interactions with instances of a single service could then be conveniently organised in inheritance hierarchies. Equally, they could combine complex handlers and service handlers, e.g. to poll the Information Service at regular intervals, to generate concurrently WS-Resources of the same service, or to accomplish a complex task which requires a sequence of interactions with instances of different services.

Architecture Overview

We present here the overall architecture of the framework. Being an application framework, gCF components are packages of classes that – individually or in collaboration – implement the functionalities summarised and motivated in this section. We shed light on the contents of individual packages in the rest of the Guide.

[Figure: Architecture.jpg]