Introduction
This document describes the Nagios plugins used mainly to monitor NorduGrid ARC compute elements and related resources, although some probes should also be usable for testing non-ARC resources. The package includes commands for:
LDAP queries and tests on the information system, including GLUE 2.0 and legacy schemas.
Job submission and monitoring of jobs with additional custom checks.
Transfers to and from storage elements using various protocols.
The following chapters will cover the probes related to each of these topics. This chapter will describe common configuration and options.
Acknowledgements. This work is co-funded by the EC EMI project under the FP7 Collaborative Projects Grant Agreement Nr. INFSO-RI-261611.
Configuration Files
The configuration is merged from a list of INI-format files, where
settings from later files take precedence. By default, files matching
/etc/arc/nagios/*.ini are read in lexicographical order, but this can be
overridden by setting $ARCNAGIOS_CONFIG to a colon-separated list of the
files to load. A naming scheme like the following is suggested:
/etc/arc/nagios/20-dist.ini - comes with the default package
/etc/arc/nagios/60-egi.ini - comes with the EGI package
/etc/arc/nagios/90-local.ini - suggested for local changes
An alternative to /etc/arc/nagios can be specified in the environment
variable $ARCNAGIOS_CONFIG_DIR.
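The lookup and merge rules above can be sketched in Python as follows. This is a minimal illustration, not the probes' actual code: the function names config_files and load_config are hypothetical, and the sketch assumes that $ARCNAGIOS_CONFIG, when set, takes priority over the directory scan.

```python
import configparser
import glob
import os


def config_files(environ=os.environ):
    """Return the INI files to load, in the order they should be merged.

    Hypothetical helper: mirrors the rules described in the text, not the
    probes' real implementation.
    """
    explicit = environ.get("ARCNAGIOS_CONFIG")
    if explicit:
        # Colon-separated list; the given order is kept as-is.
        return [p for p in explicit.split(":") if p]
    config_dir = environ.get("ARCNAGIOS_CONFIG_DIR", "/etc/arc/nagios")
    # Lexicographical order, so 20-dist.ini < 60-egi.ini < 90-local.ini.
    return sorted(glob.glob(os.path.join(config_dir, "*.ini")))


def load_config(paths):
    """Merge the files; a setting in a later file overrides earlier ones."""
    parser = configparser.ConfigParser()
    parser.read(paths)  # files are read in order, so later values win
    return parser
```

With this ordering, a value set in 90-local.ini replaces the same key from 20-dist.ini, while keys that only appear in the earlier file remain in effect.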
Under the same prefix, a default job script template is installed:
/etc/arc/nagios/20-dist.d/default.xrsl.j2
You can provide a modified script by placing it e.g. in
/etc/arc/nagios/90-dist.d/default.xrsl.j2, but be careful with this in a
production environment, since later versions of the probes may require changes
to the script that make the modified version incompatible.
Each probe has a main configuration section named after the probe, or
collectively [arcce] for the check_arcce_* probes. In this section
you can provide defaults for string-valued command-line options. The name of
the configuration variable corresponding to an option is obtained by stripping
the initial “--” and replacing “-” with “_”, e.g. “--home-dir” becomes
“home_dir”.
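The naming rule can be written out as a small Python function. This is only a sketch of the rule as stated above; option_to_config_key is a hypothetical name, and dropping an "=<value>" suffix is an added convenience.

```python
def option_to_config_key(option):
    """Map a long option such as "--home-dir" to its config variable name."""
    if not option.startswith("--"):
        raise ValueError("expected a long option starting with '--'")
    # Drop an "=<value>" part if present, strip the leading "--",
    # then replace every "-" with "_".
    name = option[2:].split("=", 1)[0]
    return name.replace("-", "_")
```

For example, a default for --multiline-separator would be set under the key multiline_separator in the probe's section.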
Common Options
The following options are common to most of the probes:
--home-dir=<dir>
Override $HOME at startup. This is a workaround for external commands which store things under $HOME on systems where the user account running Nagios does not have an appropriate or writable home directory.
--loglevel=(debug|info|warning|error)
This option allows you to increase the verbosity of the Nagios probes. Additional messages will appear as extended status lines in Nagios.
--multiline-separator=<chars>
Replacement for newlines when submitting multi-line results to passive services. Pass the empty string to drop extra lines. This option exists because Nagios currently does not support multi-line passive results.
--command-file=<path>
The path of the Nagios command file. By default $NAGIOS_COMMANDFILE is used, which is usually the right thing.
--how-invoked=(nagios|manual), --dump-options
These are only needed for debugging purposes.
--arcnagios-spooldir
Top level directory for storing state information and for use as a working area. The default is /var/spool/arc/nagios. If you need to debug an issue related to CE jobs, look under the ce-* subdirectories.
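To illustrate what --multiline-separator and --command-file are for, here is a minimal Python sketch of how a multi-line result could be flattened and formatted as a Nagios PROCESS_SERVICE_CHECK_RESULT external command. The function names are hypothetical and the probes' real implementation may differ. The empty-separator branch follows the option description above.

```python
import time


def flatten_output(text, separator):
    """Collapse multi-line plugin output into a single line."""
    lines = text.splitlines()
    if not lines:
        return ""
    if separator == "":
        # Empty separator: keep only the first line and drop the rest.
        return lines[0]
    return separator.join(lines)


def passive_result_command(host, service, status, output, separator, now=None):
    """Format one passive service result as a Nagios external command line.

    The result would be written to the Nagios command file, e.g. the path
    in $NAGIOS_COMMANDFILE.
    """
    timestamp = int(time.time() if now is None else now)
    return "[%d] PROCESS_SERVICE_CHECK_RESULT;%s;%s;%d;%s\n" % (
        timestamp, host, service, status, flatten_output(output, separator))
```

The host name and service description passed in must match a passive service defined in the Nagios configuration. Any values used when trying this out are placeholders.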
Proxy Certificate
The check_arcce_* and check_gridstorage probes require a proxy
certificate to succeed. The probes will maintain a proxy when provided with an
X509 certificate and key. You can place these in a common section:
[gridproxy]
default_voms = <voms>
user_key = <path>
user_cert = <path>
#user_proxy = <path> # Optionally override the path of the generated proxy.
The probes which require an X509 proxy have a --voms=<voms> option to
specify the VOMS server to contact instead of default_voms. When a
user_key and user_cert pair is given, the default user_proxy path
is unique to the selected VOMS.
To use a pre-initialized proxy, make sure user_key and user_cert are
not set. You will probably want to use a non-default location for the
proxy. Either point to it with the environment variable X509_USER_PROXY
or set it in the configuration file:
[gridproxy]
user_proxy = <path>
If you use several VOs which require different certificates, you can replace
the above section with one section gridproxy.<voms> per <voms> and use
the --voms option to select which section to use. These sections don’t
have the default_voms setting.
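The section selection described above can be pictured with the following Python sketch. It is an assumption about the lookup order, not the probes' actual code: proxy_settings is a hypothetical helper, and the fallback from gridproxy.<voms> to the common gridproxy section is this sketch's own choice.

```python
import configparser


def proxy_settings(config, voms=None):
    """Return (voms, settings) for the proxy section that applies.

    Hypothetical helper; `voms` stands for the value of the --voms option.
    """
    if voms is None and config.has_section("gridproxy"):
        # No --voms given: fall back to default_voms from the common section.
        voms = config.get("gridproxy", "default_voms", fallback=None)
    per_vo = "gridproxy.%s" % voms
    if voms is not None and config.has_section(per_vo):
        section = per_vo
    elif config.has_section("gridproxy"):
        section = "gridproxy"
    else:
        raise LookupError("no [gridproxy] section found")
    keys = ("user_key", "user_cert", "user_proxy")
    found = {k: config.get(section, k) for k in keys
             if config.has_option(section, k)}
    return voms, found
```

When the returned settings contain user_key and user_cert, the probe maintains the proxy itself. When only user_proxy is present, a pre-initialized proxy at that path is expected.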
Running Probes from the Command-Line
The following instructions apply to check_arcce_submit,
check_arcce_monitor, check_arcce_clean, and check_gridstorage.
The other probes can be invoked from the command-line without special
attention.
For testing and debugging, it can be convenient to invoke the probes manually
as a regular user. This can be done as follows. Choose a directory where you
can store run-time state. Below, we use /tmp, but it may be tidier to
create a fresh directory. Then, create a configuration like
[DEFAULT]
arcnagios_spooldir = /tmp/arc-nagios-testing
[gridproxy]
default_voms = <your-vo>
[gridproxy.your-vo]
user_proxy = /tmp/x509up_u<your-user-id>
substituting suitable values for the <your-*> meta-variables. You may
need to add additional settings depending on what you test, of course.
Acquire a proxy certificate (if needed) and point to the set of
configurations you need, including the above:
arcproxy -S <your-vo>
export ARCNAGIOS_CONFIG=/etc/arc/nagios/20-dist.ini:<your-config>
The probes can now be run as
check_arcce_submit --how-invoked=manual ...
check_arcce_monitor --how-invoked=manual ...
check_arcce_clean --how-invoked=manual ...
The main purpose of --how-invoked=manual is to tell the probe that any
passive results should be printed to the screen rather than submitted to the
Nagios command pipe. It is not strictly needed for active-only probes.
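For repeated manual testing, the configuration and the $ARCNAGIOS_CONFIG value shown earlier in this section can be generated rather than typed by hand. The following Python sketch is one way to do that. Both function names are hypothetical, and the proxy path assumes the /tmp/x509up_u<your-user-id> location used in the example configuration.

```python
import os


def manual_test_config(vo, spooldir="/tmp/arc-nagios-testing", uid=None):
    """Render the INI text for running the probes as a regular user."""
    uid = os.getuid() if uid is None else uid
    return (
        "[DEFAULT]\n"
        "arcnagios_spooldir = %s\n"
        "[gridproxy]\n"
        "default_voms = %s\n"
        "[gridproxy.%s]\n"
        "user_proxy = /tmp/x509up_u%d\n" % (spooldir, vo, vo, uid)
    )


def arcnagios_config_value(local_config, dist="/etc/arc/nagios/20-dist.ini"):
    """Build the colon-separated value for $ARCNAGIOS_CONFIG."""
    # The local file comes last so that its settings take precedence.
    return ":".join([dist, local_config])
```

Write the rendered text to a file of your choice and export the value returned by arcnagios_config_value before invoking the probes with --how-invoked=manual.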