# $Id: README,v 1.4 2008/12/03 16:13:53 ksb Exp $
# $Source: /usr/msrc/usr/local/libexec/vcsstats/RCS/README,v $

Description
===========

We want to remember which host owns a VCS service group over time, both
to see the current state without a login to the cluster and to associate
system statistics with the correct service groups in the past.
Redundancy and minimal configuration are desired.  Supplemental
statistics, such as the number and type of log messages for a group and
the time a group came online, are also useful.  Given the infrequency of
change in the VCS engine, a granularity of an hour in data samples is
sufficient.

To achieve this, vcsstats, when run, examines the current system and
reports on any groups that are online or partially online on that
system.  In most cases, no specialized configuration for vcsstats is
necessary; running the command with no options will cause it to read the
system and group configuration from VCS.  As groups move from system to
system, different hosts will report on each group; a system that owns no
groups will generate no reports.  This method provides redundancy, as
the responsibility for reporting on a group moves with the group during
failover.

The granularity of the RRD generated by the associated CVU is one hour.
Oversampling by roughly a factor of e (the natural number), vcsstats
should be scheduled to run once every 20 minutes.  This means changes to
group state might only be visible after a latency of 50 minutes.

Note that the system name by which VCS knows a host may not match the
operating system's hostname.  Some assumptions are made about the VCS
configuration:

1) The name of each system in the VCS cluster matches the unqualified
   hostname of each host.  If this is not the case, either the "-a" or
   the "-s" option must be used; otherwise no groups will ever be found
   to be running on the current host.

2) The system names share the same alphabetical base and are
   differentiated only by a unique numerical suffix.
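
As a rough illustration of assumption 2, the numerical suffix of a
system name can be split off with plain POSIX shell parameter expansion.
This is only a sketch: the function name numeric_suffix and the sample
system names are hypothetical, and nothing here is taken from the
vcsstats source.

```shell
#!/bin/sh
# Hypothetical helper (not part of vcsstats): print the trailing run of
# digits in a system name, or return non-zero if there is none.
numeric_suffix() {
	# Remove the longest prefix that ends in a non-digit character;
	# what remains is the trailing digits (possibly nothing).
	suffix=${1##*[!0-9]}
	[ -n "$suffix" ] || return 1
	printf '%s\n' "$suffix"
}

# Sample system names below are made up for the example.
numeric_suffix db07		# prints 07
numeric_suffix dbhost || echo "no numerical suffix" >&2
```

A name without trailing digits makes the function fail, which is the
case where some other source for a numerical identifier is needed.
Leading zeros are kept as-is ("db07" yields "07").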

RRDs can only receive numerical data, so a numerical suffix is used to
identify the system in the update to PEG.  If the systems do not have
unique numerical suffixes, "-l" must be specified to read a numerical
identifier for each system out of /etc/llthosts.

It is strongly recommended, when setting up a VCS cluster, that each VCS
system name always match the unqualified hostname and that the hostnames
share the same alphabetical base followed by a unique numerical suffix.

Usage
============

vcsstats [-x] [-N host] [-t target] [-a | -s system] [group] ...
vcsstats -h
vcsstats -V

a           Report on service groups running on all systems in the
            cluster.  With this, one system in the cluster can send
            updates for all service groups.  This is not recommended,
            as it does not provide adequate redundancy by itself.

l           Read the unique numerical mapping for each system from
            /etc/llthosts rather than deriving it from a numerical
            suffix on the system name.  This should always work, as
            such a mapping must exist for VCS to work.  However, the
            mapping may not be intuitive to users, as it is only meant
            to be used internally by LLT and GAB.

N host      Send reports to PEG as a particular hostname.  This is
            useful if the hostname of the current host is wrong or
            unqualified.

s system    Report on groups on a particular system.  This matches the
            system name within the VCS configuration, not the hostname.
            If left unspecified, the unqualified hostname of the
            current host will be used as the system name.

t target    Address and port of a PEG listener to send updates to.

x           Trace the data from the RRD updates to standard error.

h           Print a usage message.

V           Show version info and configuration.

group       Report only on specific service groups.  The groups must be
            owned by one of the selected systems or no RRD update for
            those groups will be generated.

Dependencies and Environment
============

This tool has been tested with Veritas Cluster Server 4.1.

Note that vcsstats will augment its PATH with /usr/local/bin
automatically to find these tools.  Any one of these defaults can be
overridden by setting the indicated environment variable.

op vcs-display grp                  VS_DGRPS_COMM
op vcs-display res                  VS_DRES_COMM
        vcsstats must be able to read the output of "hagrp -display"
        and "hares -display".  As these commands require root
        privileges and vcsstats should not be run as root, vcsstats
        defaults to using the op rule "vcs-display" to retrieve this
        information.

since                               VS_SINCE_COMM
        The log file is read via the since tool so that only additions
        since the last run are processed.

rrdup                               VS_RRDUP_COMM
        Updates to PEG are formatted by rrdup, which takes
        responsibility for sending the update over the network.

/var/VRTSvcs/log/engine_A.log       VS_ENGINE_LOG
        Events that are specific to a running group will be counted and
        associated with the system that generated them.  A tally of
        each message type from the logs is sent in the update.  Also,
        the time a group came fully online is determined from this log.

/etc/llthosts                       VS_LLTHOSTS
        This is used only with the "-l" option, when a unique numerical
        identifier for a host cannot be determined from the system
        name.

$HOME/.vcsstats                     VS_STATE_DIR
        Any persistent state needed by vcsstats will be kept in this
        directory, which is automatically created when the tool is run.

$VS_STATE_DIR/since                 VS_SINCE_DB
        A dedicated since database file will be used when reading logs.

$VS_STATE_DIR/state                 VS_STATE_FILE
        Certain data that cannot be easily determined with each run of
        the tool, such as online time, is kept here.  This file is
        eval'ed directly into the running vcsstats.

Bugs
============

The "since" operation is not atomic with the RRD update.  While effort
is made to minimize the risk, it is possible for vcsstats to fail for
some reason after since has updated the timestamp in its database, but
before the update is sent.  Also, because the update is a single UDP
datagram, it may be dropped from the network.
In either case, data from the VCS log will be lost.

Killing a running vcsstats may cause a disconnection warning in the VCS
engine log that will be picked up by monitoring tools such as logsurfer
or OVO.  Such a warning is innocuous and may be ignored.

-- jad, Feb 2008