Administrator Guide

Revision as of 13:40, 26 July 2011 by Gabriele.giammatteo (Talk | contribs) (gHN Yum Repositories)

This Guide covers the installation, configuration, and maintenance of gCore.

Prerequisites

The following are prerequisites for the installation of gCore:

  • J2SE 1.6 update 4 SDK or greater. Sun's reference implementation is recommended, but versions from IBM, HP, or BEA should work equally well.
  • Ant 1.6.5+ to build gCF sources or to develop services with it.
  • an SVN client to install gCore from the SVN repository.
  • GNU tar to install gCore from archived distributions.
  • sudo privileges on the shell.
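
As a quick sanity check before installing, the presence of the tools listed above can be probed from the shell. The sketch below is only an illustration: it assumes the usual executable names (java, ant, svn, tar, sudo), the helper name have_tool is hypothetical, and it checks presence only, not the minimum versions required above.

```shell
#!/bin/sh
# Succeed if the named command can be found on the PATH.
have_tool() {
    command -v "$1" >/dev/null 2>&1
}

# Executable names are assumptions; adjust them to the local installation.
for tool in java ant svn tar sudo; do
    if have_tool "$tool"; then
        echo "found:   $tool"
    else
        echo "missing: $tool"
    fi
done
```

Versions still have to be checked by hand, for example with java -version and ant -version.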

The following are prerequisites for the operation of gHN in any infrastructure:

  • A static IP address and preferably a DNS name.

The following are prerequisites for the operation of gHN in a secure infrastructure:

  • An NTP server to synchronise the machine's clock for correct credential validation.
  • A host certificate and private key (owned by the user that runs the container) respectively in:
 /etc/grid-security/hostpubliccert.pem (please check that the certificate file has -rw-r--r-- permissions)
 /etc/grid-security/hostprivatekey.pem (please check that the private key file has -r-------- permissions).
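
The permission checks requested above can be scripted. The following is a minimal sketch, assuming GNU stat (the -c option is not available in every stat implementation); the helper names perm_of and check_perm are hypothetical. It relies on -rw-r--r-- corresponding to octal mode 644 and -r-------- to mode 400.

```shell
#!/bin/sh
# Print the octal permission bits of a file (GNU stat syntax).
perm_of() {
    stat -c '%a' "$1"
}

# Compare a file's permissions with the expected octal mode.
check_perm() {
    file=$1
    expected=$2
    if [ ! -e "$file" ]; then
        echo "missing: $file"
        return 1
    fi
    actual=$(perm_of "$file")
    if [ "$actual" = "$expected" ]; then
        echo "ok:      $file ($actual)"
    else
        echo "wrong:   $file has $actual, expected $expected"
        return 1
    fi
}

# -rw-r--r-- is mode 644, -r-------- is mode 400.
check_perm /etc/grid-security/hostpubliccert.pem 644 || true
check_perm /etc/grid-security/hostprivatekey.pem 400 || true
```

If a mode is reported as wrong, chmod 644 or chmod 400 on the respective file, run by its owner, brings it in line.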

Installation

Once downloaded, gCore can be installed in a directory of choice (the gCore location). Whether gCore was obtained from the SVN repository or from an archived distribution, perform the installation as a non-privileged user with read and write permissions on the gCore location. Due to technical constraints, the current version of gCore requires that different installations run under different users, i.e. the same user cannot configure and execute more than one container.

The structure of the installation is the following:

|-bin
|
|-config
|
|-endorsed
|
|-etc
|
|-lib
|
|-libexec
|
|-logs
|
|-share

Some folders are of immediate interest to administrators and developers alike:

bin: executables.
config: gHN configuration files.
etc: configuration files of the container and of deployed services.
lib: standard and deployed service libraries.
logs: log files for gHN, Local Services, and legacy technologies.
share: build tools, standard and deployed service interfaces and schemas.


gHN Yum Repositories

Starting from gCube 2.5.0, the gHN container is distributed in rpm packages, alongside the usual tar.gz format, to ease installation on platforms that support the rpm packaging system. For each release, a YUM repository is created that contains all the rpm packages needed to install a gHN node.

Targeted platforms are, for the moment, CERN Scientific Linux 4 32/64 bits and CERN Scientific Linux 5 32/64 bits.

Installation procedure:

  • go to the distribution site and download the .repo file for the gHN version.
cd /etc/yum.repos.d
wget <DISTRIBUTION_SITE_LINK>
  • the content of the downloaded file should be similar to this:
[ghn2-3-0]
name=D4Science gHN version 2-3-0
baseurl=http://etics-repository.cern.ch/repository/pm/volatile/repomd/id/064dc458-c5b5-4836-8d20-b0baaedf46ff/sl5_x86_64_gcc412
enabled=1
  • to install the gHN software, issue:
yum install org.gcube.distribution.ghn-rpm-distribution --nogpgcheck

Third-party software

gCore ships a number of third-party products. The source material is copyright of the original publishers, and the software is governed by the terms and conditions of the respective third-party software.

The complete list, grouped by provider, follows.

Apache Software Foundation (ASF) ANT

  • ant-launcher 1.6.5
  • ant 1.6.5
  • antlr 2.7.6

ASF AXIS

  • addressing 1.0
  • axis 1.2RC (globus patched)
  • saaj 1.2RC
  • jaxrpc 1.2RC
  • axis-url 1.2.6
  • wsdl4j 1.2RC

ASF XML

  • resolver 1.1.1
  • xercesImpl 2.6.2
  • xml-apis 2.6.2
  • xmlsec 1.2.1
  • xalan 2.6

ASF COMMONS

  • commons-beanutils 1.6.1
  • commons-cli 2.0
  • commons-collections 3.0
  • commons-digester 1.2
  • commons-discovery 0.2dev
  • commons-io 1.2
  • commons-lang 2.4
  • commons-logging 1.1.1

Tomcat 4.1

  • naming-java 4.1
  • naming-resources 4.1
  • naming-factory 4.1
  • naming-common 4.1

GLOBUS 4.0.x

  • cog-axis
  • cog-jglobus
  • cog-tomcat
  • cog-url
  • puretls 0.9b4
  • cryptix-asn1 ?
  • cryptix.jar ?
  • cryptix32 3.2.0
  • bootstrap ?
  • globus_usage_core
  • globus_usage_packets_common
  • globus_wsrf_mds_aggregator
  • globus_wsrf_mds_aggregator_stubs
  • globus_wsrf_servicegroup
  • globus_wsrf_servicegroup_stubs
  • wsrf_common
  • wsrf_core
  • wsrf_mds_index_stubs
  • wsrf_mds_usefulrp
  • wsrf_test
  • wsrf_tools
  • wsrf_mds_usefulrp_schema_stubs
  • wsrf_provider_jce
  • wsrf_core_stubs
  • wsrf_mds_index

gLite

  • glite-security-util-java 1.3.4

MISC

  • cglib 2.2
  • objenesis 1.1
  • bcprov-jdk14 1.2.2
  • jce-jdk13 1.2.5
  • concurrent ?
  • SUN servlet.jar 2.3/1.2(JSP)
  • opensaml 1.0.1 (globus patched)
  • kxml2 2.3.0
  • log4j 1.2.15
  • jgss ?
  • junit 3.8.1
  • wss4j ?
  • SUN jsr173_api ?
  • BEA commonj 1.1
  • Jaxen XPath library - jaxen-1.1-beta-9.jar

Configuration

Configuring the installation can be roughly divided into the following steps: configuring the environment, the container, the gHN associated with a running instance of the container, and the operation of the gHN in a secure infrastructure.

Configuring the Environment

  • Define an environment variable GLOBUS_LOCATION and point it to the gCore location. Assuming a bash shell:
export GLOBUS_LOCATION=<absolute path to your gCore location>
  • (optional) Add $GLOBUS_LOCATION/bin to the value of your PATH environment variable:
export PATH=$PATH:$GLOBUS_LOCATION/bin
  • (optional) If building gCF-compliant services, define an environment variable BUILD_LOCATION and set it to the location from which ant will be invoked and where temporary build structures and artefacts will be located:
export BUILD_LOCATION=<absolute path to your build location>
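
To make these settings persistent, the exports are typically placed in the shell start-up file of the user that runs the container. The sketch below shows one possible arrangement with a small sanity check. The two directory names ($HOME/gCore and $HOME/gcore-build) are placeholders chosen for illustration, and is_absolute is a hypothetical helper.

```shell
# Candidate lines for ~/.bashrc; both directories are placeholders (assumptions).
export GLOBUS_LOCATION="$HOME/gCore"
export PATH="$PATH:$GLOBUS_LOCATION/bin"
export BUILD_LOCATION="$HOME/gcore-build"

# Succeed only when the argument is an absolute path.
is_absolute() {
    case "$1" in
        /*) return 0 ;;
        *)  return 1 ;;
    esac
}

# Warn about the two most common mistakes without aborting the shell.
is_absolute "$GLOBUS_LOCATION" || echo "GLOBUS_LOCATION must be an absolute path"
[ -d "$GLOBUS_LOCATION/bin" ]  || echo "no bin directory under $GLOBUS_LOCATION"
```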

Configuring the Container

Specify the hostname of your machine as the value of logicalHost parameter in the container's configuration file $GLOBUS_LOCATION/etc/globus_wsrf_core/server-config.wsdd:

<parameter name="logicalHost" value="..yourhostname..."/>
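
After editing the file, the configured value can be read back and compared with the name the machine reports. This is a rough sketch only: it assumes the parameter is written on a single line with the attribute order shown above, and logical_host is a hypothetical helper, not part of gCore.

```shell
# Print the logicalHost value found in a WSDD file (prints nothing if absent).
# Assumes name and value appear on one line, in this order.
logical_host() {
    sed -n 's/.*<parameter name="logicalHost" value="\([^"]*\)".*/\1/p' "$1"
}

# Fallback path is a placeholder for shells where GLOBUS_LOCATION is unset.
wsdd="${GLOBUS_LOCATION:-/path/to/gCore}/etc/globus_wsrf_core/server-config.wsdd"
if [ -f "$wsdd" ]; then
    echo "configured logicalHost: $(logical_host "$wsdd")"
    echo "this machine reports:   $(hostname)"
fi
```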

In the default configuration, the container typically allocates 1GB of heap space to the JVM in which it runs. This is a production-level requirement. It can be increased by setting new parameters in the $GCORE_START_OPTIONS variable, either by editing the script $GLOBUS_LOCATION/bin/gcore-start-container or by defining the variable in the execution environment. To decrease the memory used, the script itself must be edited: when a setting is duplicated, the JVM applies the higher value.

Moreover, any setting specified in the $GCORE_START_OPTIONS variable is passed to the container process and evaluated by the JVM.
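
For instance, the heap ceiling could be raised from the execution environment as sketched below. The 2 GB figure is an arbitrary example, and -Xmx is the customary JVM option for the maximum heap size; whether other options are appropriate depends on the JVM in use.

```shell
# Request a 2 GB maximum heap for the container JVM (example value only).
export GCORE_START_OPTIONS="-Xmx2g"
echo "container will be started with: $GCORE_START_OPTIONS"

# Then start the container as usual, e.g.:
# gcore-start-container
```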

Configuring the gHN

The configuration of the gHN that relates to its operation within the infrastructure can be found in $GLOBUS_LOCATION/config/GHNConfig.xml. The file $GLOBUS_LOCATION/config/GHNConfig.client.xml can be used to dedicate a separate configuration to a gHN that operates in client mode.

The following gHN properties are available for configuration:

securityenabled: true if the gHN can operate in a secure infrastructure, false otherwise.
mode: either CONNECTED or STANDALONE depending on whether the gHN does or does not publish information in the infrastructure.
infrastructure: the name of the infrastructure in which the gHN operates (e.g. gcube, d4science, ...).
startScopes: a comma-separated list of VOs that the gHN joins.
allowedScopes: a comma-separated list of VOs that the gHN will potentially join (upon VO Manager decision).
labels: the name of the file that includes custom labels to characterize the gHN. These are added to those automatically derived by gCore and published in the gHN profile. The file name must be relative to the $GLOBUS_LOCATION/config directory.
GHNtype: either DYNAMIC or STATIC depending on whether the gHN can or cannot be used as a target for dynamic deployment operations.
coordinates: a pair of comma-separated values for the latitude and longitude of the gHN. Coordinates for some popular locations are available here.
country: the two-character ISO code of the country where the gHN is located.
location: the name of the location.
publishedHost: the hostname to declare in the GHN and Running Instance profiles, if different from the actual one.
publishedPort: the port to declare in the GHN and Running Instance profiles, if different from the actual one.
updateInterval: how often the gHN has to refresh its profile on the IS (in seconds).
portRange: [optional] a dash-separated pair of numbers that identify a range of free ports, if any.
testInterval: [optional] how often the monitoring probes have to perform local tests on the gHN.

For example, the configuration required to join the gHN to the /gcube/devsec and /gcube/testing VOs is the following:

 infrastructure = gcube 
 startScopes = devsec,testing
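
The relation between the two properties and the resulting VO scopes can be made explicit with a small helper. This is an illustrative sketch only: scope_paths is a hypothetical function that mirrors the example above, not how gCore itself computes scopes.

```shell
# Print one scope per start scope, in the form /<infrastructure>/<VO>.
# $1 = infrastructure, $2 = comma-separated startScopes value
scope_paths() {
    printf '%s\n' "$2" | tr -d ' ' | tr ',' '\n' | while IFS= read -r vo; do
        if [ -n "$vo" ]; then
            printf '/%s/%s\n' "$1" "$vo"
        fi
    done
}

# Prints /gcube/devsec and /gcube/testing, one per line.
scope_paths gcube devsec,testing
```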

For an in-depth coverage of scope and scope-related parameters (infrastructure and startScopes) see the Developer Guide.

Configuring Logging

A running gCore container will produce extensive logs in accordance with the log4j configuration directives contained in $GLOBUS_LOCATION/container-log4j.properties. By default, the container logs to a file called $GLOBUS_LOCATION/logs/container.fulllog with a TRACE level for all the gCF components, and to $GLOBUS_LOCATION/logs/container.log with an INFO level for all the gCF components. Local Services also have dedicated file loggers.

Configure container security

  • Set the Security Descriptor of the underlying container in $GLOBUS_LOCATION/etc/globus_wsrf_core/global_security_descriptor.xml, using this example as a guide.
Note: Please include the configuration <context-timer-interval value="300000"/> to ease the effect of a well-known bug with the underlying technologies (as in the example above).
  • Add the following configuration in the <globalConfiguration> section of $GLOBUS_LOCATION/etc/globus_wsrf_core/server-config.wsdd:
<parameter name="containerSecDesc" value="etc/globus_wsrf_core/global_security_descriptor.xml"/>

Configure VOMS credentials

VOMS credentials must be installed in the local system to verify VOMS assertions. To do this:

  • Copy the certificates of trusted VOMS servers into $GLOBUS_LOCATION/etc/grid-security/vomsdir.
Note: Please check that certificate files have -rw-r--r-- permissions.
  • Create VOMS files in /opt/glite/etc/vomses using the following conventions:
  • file naming convention:
 <VO Name>-<VOMS SERVICE HOSTNAME>
  • content convention:
 "<VO Name>" "<VOMS SERVICE HOSTNAME>" "<VOMS SERVICE PORT>" "<DN of VOMS CERTIFICATE>" "<VO LOCAL NAME>"
Example: "devsec" "grids01.gcore.org" "15001" "/C=IT/O=INFN/OU=Host/L=GCORE/CN=grids01.gcore.org" "devsec"
Note: The VO name devsec should be associated with the VOMS service running on grids01.gcore.org. This ensures that the assertions contained in proxy credentials are validated properly.
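
Because each field of a VOMS file must be individually quoted, generating the file name and content from the same values helps to avoid quoting slips. The helpers below (vomses_name and vomses_line) are hypothetical and simply follow the two conventions given above.

```shell
# Build the file name: <VO Name>-<VOMS SERVICE HOSTNAME>.
vomses_name() {
    printf '%s-%s\n' "$1" "$2"
}

# Build the content line.
# Arguments: VO name, VOMS host, VOMS port, DN of the VOMS certificate, VO local name.
vomses_line() {
    printf '"%s" "%s" "%s" "%s" "%s"\n' "$1" "$2" "$3" "$4" "$5"
}

# Values taken from the example in this guide.
vomses_name devsec grids01.gcore.org
vomses_line devsec grids01.gcore.org 15001 \
    "/C=IT/O=INFN/OU=Host/L=GCORE/CN=grids01.gcore.org" devsec
```

The output of vomses_line can then be redirected into /opt/glite/etc/vomses/ under the name printed by vomses_name, by a user with write access to that directory.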

Install voms-proxy-init command for local testing (optional)

  • Install rpms in the order in which they appear in the download page.
  • Copy the configuration file to the directory /etc/glite/profile.d/
  • Modify the configuration file in accordance with the local values of the environment variables JAVA_HOME and GLOBUS_LOCATION.

Verify the Installation

To verify the installation, start the container with the script $GLOBUS_LOCATION/bin/gcore-start-container. Assuming PATH is set as recommended above:

gcore-start-container

will suffice. Any instance of the container that is already running should be killed automatically, and the new instance should log the list of deployed services in $GLOBUS_LOCATION/nohup.out and detailed information about the startup of local services in $GLOBUS_LOCATION/logs/container.log. The absence of visible errors in both files indicates a successful gCore installation and startup.

Note: By default, the command above starts the container on port 8080. To use another port, pass the -p <port> option.
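
Scanning the two files for errors can be partly automated. The sketch below counts the lines that mention the log4j levels ERROR or FATAL. It assumes that the level names appear verbatim in the log lines, which depends on the log4j layout in use, and count_errors is a hypothetical helper. A count of zero is a good sign but does not replace reading the logs.

```shell
# Count lines that mention ERROR or FATAL in a log file (0 if the file is missing).
count_errors() {
    if [ -f "$1" ]; then
        # grep -c prints the count but exits non-zero when it is 0.
        grep -c -E 'ERROR|FATAL' "$1" || true
    else
        echo 0
    fi
}

for log in "${GLOBUS_LOCATION:-.}/nohup.out" "${GLOBUS_LOCATION:-.}/logs/container.log"; do
    echo "$log: $(count_errors "$log") suspicious line(s)"
done
```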


Troubleshooting

gHN with a high number of invocations

If a gHN hosts a service instance subject to a high number of invocations, the gContainer may stop responding to callers' requests. This is due to a mistaken usage of the "id" command by the underlying technology. The process soon falls into a "too many open files" state and can no longer do anything useful.

When such a condition is predictable, we suggest creating a temporary folder for the system commands and removing the "id" command from it.

The complete workaround is the following:

  • create a folder in which to collect the system commands:
mkdir fakebin
cd fakebin
find /usr/bin -type f -exec ln -s {} . \;
find /usr/bin -type l -exec cp -a {} . \;
rm id
  • remove the /usr/bin folder from the PATH
  • add the new fakebin folder to the PATH

Of course, this may affect other processes, so the patched environment must be used only for the gContainer process.
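
One way to confine the patched PATH to the gContainer process is to compute it separately and pass it only to the start command. The following is a sketch under assumptions: fakebin is taken to live in the home directory of the container user, and path_without is a hypothetical helper.

```shell
# Print a PATH-style list with every exact occurrence of one directory removed.
# $1 = directory to drop, $2 = colon-separated list
path_without() {
    printf '%s\n' "$2" | tr ':' '\n' | grep -v -x -F -e "$1" | paste -s -d ':' -
}

# Location of fakebin is an assumption; use the directory created in the workaround.
fakebin="$HOME/fakebin"
patched_path="$fakebin:$(path_without /usr/bin "$PATH")"
echo "container PATH would be: $patched_path"

# Apply it to the container process only, leaving the login shell untouched:
# env PATH="$patched_path" "$GLOBUS_LOCATION/bin/gcore-start-container"
```

Passing the value through env keeps the change local to that single command, so other processes started from the same shell still see the original PATH.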