Polled Data – Nagios, JMX, Windows Performance Counters

Overview

AppFirst has made it easy to integrate popular Polled Data sources into our platform. Right now we support three Polled Data sources: Nagios Plugins, our own custom-written JMX Plugin, and Windows Performance Counters.

Configuring Polled Data for a Server

Navigate to Admin – Setup and click the “Polled Data” tab to configure Polled Data. This process is the same for Nagios, JMX, and Windows Performance Counters.

Configure Polled Data

Click on a server to open its Polled Data Config File. This is where you add any Polled Data scripts, whether they’re Nagios, JMX, or Windows Performance Counters. See below for examples of what Polled Data Config Files look like.

Polled Data Deletion

To delete polled data, follow these steps:

  • Go to Admin -> Setup -> Polled Data
  • Select Manage Polled Data (found to the top right of the polled data table)
  • Select the polled data items you would like to delete
  • Once everything is selected, click “Delete selected” in the upper left of the page. When the “Are you sure?” prompt appears, click Yes and the selected Polled Data will be deleted.

    Nagios Plugins

    JMX

    The lines that start with jmx_command are for JMX.

    Configure JMX

    Windows Performance Counters

    Configure Windows Performance Counters

    *Note: After you add Polled Data in the Polled Data Config File in the user interface, the collector will restart automatically. If you add Polled Data directly on the collector’s machine, you’ll need to restart the collector in order to pick up that data.

    Viewing the Output of your Polled Data

    In the Servers Page

    In the Servers page, click on a server, and then click on the “Polled Data” tab at the end. This is where you can view all of your Polled Data for a server.

    Hover over a message to see the complete entry. You can also click on the message to graph that script/command in Correlate.

    View Polled Data

    In the Dashboard

    To add your Polled Data to a Dashboard, you can use the Polled Data Metric widget or the Multi-Value Widget.

    Polled Data Widget

    Multi Value Widget

    In Correlate

    In Correlate, click on any number of Polled Data scripts/commands from the Polled Data column to graph them. Click on a point in the graph to view all the messages at that minute for all your graphed Polled Data.

    Correlate Polled Data

    In Alerts

    You can view any of your Polled Data alerts in the Alerts page.

    Using Nagios Plugins in AppFirst

    Nagios plugins are available to collect some data that AppFirst doesn’t automatically collect, such as database specifics and information about particular components (such as Websphere, Citrix Metaframe). You can visit the Nagios Exchange to browse the library of plugins. We support any plugin available.

    Nagios normally requires an entire dedicated server to run plugins. But with AppFirst, our collector automatically picks up the output of Nagios plugins you have running on that server. This means you no longer need a dedicated Nagios server to run your plugins. If you’re already using Nagios in your environment, you can take all of your plugins and add them into the Polled Data Config File so you don’t have to reconfigure anything.

    Using Nagios Plugins and Linux

    If you’re using Linux and you don’t have Nagios plugins set up, you will see “No Polled Data Found” in the Polled Data sections of our product. You will also see this message if your Nagios plugin configuration is not in the default location (/usr/local/nagios/etc/nrpe.cfg). If your plugins are set up in a different location, update the path in the Edit section for a collector and wait a few minutes for the change to be pushed to the collector. Then restart the collector manually with the command: sudo /etc/init.d/afcollector restart (we realize a manual restart is inconvenient, and we are working on automating this process).

    For demonstration purposes, we will use the example of the Nagios plugin check_apache_load.pl.

    • Download or copy the Nagios plugin check_apache_load.pl to the AppFirst folder for the corresponding collector. This folder is /usr/share/appfirst/plugins/libexec/
    • Use the Command Line to execute the script:
      check_apache_load.pl

      and verify:

      • That it works
      • The needed Command Line parameters

    *Note: The Command Line parameters will be needed when you modify the Polled Data Config File.

    Using Nagios Plugins and Windows

    For demonstration purposes, we will use the example of the Nagios plugin check_disk.vbs.

    • Download or copy the Nagios plugin check_disk.vbs to the AppFirst folder for the corresponding collector.
    • Use the Command Line to execute the script check_disk.vbs and verify:
      • That it works
      • The needed Command Line parameters

    Windows - Nagios

    *Note: The complete set of Command Line parameters will be needed when you modify the Polled Data Config File.

    *Note: The plugin command shown below is cut off, so here it is in full:

    command[check_disk]=C:\Windows\System32\cscript.exe //Nologo "C:\Program Files\Appfirst\scripts\check_disk.vbs" /w:30 /c:10 /d:C
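When checking entries like the one above before pasting them into the Polled Data Config File, it can help to split each line into its name and its command line. The following is a minimal Python sketch, assuming entries follow the command[name]=command line shape shown above; the function name parse_command_entry is hypothetical and is not part of AppFirst.

```python
import re

# Assumed shape, taken from the example above: command[name]=full command line
_COMMAND_RE = re.compile(r"^command\[(?P<name>[^\]]+)\]=(?P<command_line>.+)$")


def parse_command_entry(line):
    """Return (name, command_line) for a command[...] entry, or None if it does not match."""
    match = _COMMAND_RE.match(line.strip())
    if not match:
        return None
    return match.group("name"), match.group("command_line")
```

For the check_disk entry, this returns the name check_disk and the full cscript.exe command line, including its /w, /c and /d parameters.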

    Windows - Edit Nagios

    MySQL Nagios Plugin

    There are at least 12 plugins available that provide various details from a MySQL instance. The specific plugin or plugins to be used will depend on the information to be extracted from a MySQL instance. Refer to http://exchange.nagios.org/directory/Plugins/Databases/MySQL for a list of available plugins.

    To extract details from a MySQL instance, follow these steps:

    1. Download the required MySQL plugin from http://exchange.nagios.org/directory/Plugins/Databases/MySQL
    2. Install the plugin in /usr/share/appfirst/plugins/libexec
    3. Refer to the documentation that comes with the plugin for details on how to execute it
    4. Test by executing the plugin from the install directory
    5. When you have determined the appropriate command line parameters, apply them to the collector Polled Data Config File

    Below is an example using the MySQL table status plugin:

    % cd /usr/share/appfirst/plugins/libexec
    % ./checkMySQLTableStatus.py -u root -p appfirst -m rows,data_length,index_length,data_free,auto_increment -w 100M,50G,50G,500M,2G

    Creating your own custom Nagios plugins

    Nagios plugins use a very simple string format:
    Metric_Name Status: results | performance_data

    Where:

    • Metric_Name is the name of the metric being monitored
    • Status is OK, WARN or CRIT
    • results are optional values reported by your plugin
    • performance_data is optional and is defined below

    AppFirst displays the Status values and lets you create alerts on them. We also graph your performance data and keep historical data for all plugin results.

    Here is an example of plugin output:
    PING ok - Packet loss = 0%, RTA = 0.80 ms | percent_packet_loss=0, rta=0.80
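As a rough illustration of the string format described above, the Python sketch below assembles a plugin output line. The function name format_plugin_output and its parameters are hypothetical. The exit codes follow the usual Nagios plugin convention (0 for OK, 1 for warning, 2 for critical) and are an assumption here, not something stated by AppFirst.

```python
# Conventional Nagios exit codes for each status (an assumption, not AppFirst-specific).
EXIT_CODES = {"OK": 0, "WARN": 1, "CRIT": 2}


def format_plugin_output(metric_name, status, results="", perf=None):
    """Build a line of the form 'Metric_Name Status: results | performance_data'."""
    if status not in EXIT_CODES:
        raise ValueError("status must be OK, WARN or CRIT")
    line = f"{metric_name} {status}: {results}".rstrip()
    if perf:
        # Performance data: space-separated label=value pairs after the pipe.
        perf_text = " ".join(f"{label}={value}" for label, value in perf.items())
        line = f"{line} | {perf_text}"
    return line
```

A plugin script would print the returned line and then exit with EXIT_CODES[status], for example by calling sys.exit at the end of the script.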

    For Linux

    You can both read plugin data and manage plugin configuration through APIs.

    Here is an example command line for a simple plugin:
    /usr/local/nagios/libexec/check_load -w 15,10,5 -c 30,25,20

    Results from the above command line:
    OK - load average: 0.28, 0.16, 0.11|load1=0.280;15.000;30.000;0; load5=0.160;10.000;25.000;0; load15=0.110;5.000;20.000;0;

    Performance data is defined as “everything after the | of the plugin output.” It takes the form of:
    'label'=value[UOM];[warn];[crit];[min];[max]

    Where:

    • UOM is the Unit Of Measure, one of:
      • Time as s, ms, us
      • Size as B, KB, MB, GB & TB
      • C is a continuous counter
      • No unit, in which case the value is treated as a plain number (int or float)

    A good reference for plugin formats can be found here: http://nagios.sourceforge.net/docs/3_0/perfdata.html
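To make the performance data format more concrete, here is a minimal Python sketch that parses everything after the | into a dictionary. It is an illustration only: the function name parse_perfdata is hypothetical, it assumes labels contain no spaces, and it treats any threshold that is not a plain number as missing.

```python
import re

# value[UOM]: a number followed by an optional unit suffix such as s, ms, B, KB or c.
_VALUE_RE = re.compile(r"^(-?\d+(?:\.\d+)?)([A-Za-z%]*)$")


def _to_float(text):
    """Convert a threshold field to float, or None if it is empty or not a plain number."""
    try:
        return float(text)
    except ValueError:
        return None


def parse_perfdata(plugin_output):
    """Return {label: fields} for the performance data after the '|' in plugin output."""
    if "|" not in plugin_output:
        return {}
    perf_text = plugin_output.split("|", 1)[1]
    metrics = {}
    for item in perf_text.split():
        label, _, rest = item.partition("=")
        fields = rest.split(";")
        match = _VALUE_RE.match(fields[0])
        if not match:
            continue  # skip anything that does not look like value[UOM]
        # Pad so warn, crit, min and max always exist, even when omitted.
        warn, crit, minimum, maximum = (fields[1:] + [""] * 4)[:4]
        metrics[label.strip("'")] = {
            "value": float(match.group(1)),
            "uom": match.group(2),
            "warn": _to_float(warn),
            "crit": _to_float(crit),
            "min": _to_float(minimum),
            "max": _to_float(maximum),
        }
    return metrics
```

Applied to the check_load output above, this yields one entry each for load1, load5 and load15, with the warning and critical thresholds taken from the second and third fields.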

    JMX Plugin

    If you are using version 67 or older of the Linux Collector, please see here.

    How To Set Up JMX

    To use the JMX plugin, you need version 68 or higher of our collector. With V68 or newer, you don’t have to do anything to start the JMX process. It starts and stops with the collector if you have JMX commands enabled in your Polled Data Config File.

    If you need to, you can start or stop the JMX process manually as follows:

    To Start

    % sudo /usr/share/appfirst/plugins/libexec/jmx-collector/jmxcollector start

    To Stop

    % sudo /usr/share/appfirst/plugins/libexec/jmx-collector/jmxcollector stop

    Polled Data Config File

    The Polled Data Config file includes two statements to support JMX:

    1. jmx_command has the form:

    jmx_command[display-name-of-the-metric] options

    Required options:

    • -P pattern: a pattern matching a unique string associated with the java command line.
      • The -P option should be a value from the java command line associated with the JRE from which you are getting managed beans over JMX. The Sun API used to connect to JMX requires it as a unique string that identifies the JRE to connect to.
      • For example, a ‘ps -ef | grep java’ command displays the processes and command lines for the available JREs on a server. This information is also available in the Servers page. If you have only one JRE running on a given server, almost any string from the command line parameters should work.
      • Here is an example that we use for an Hbase region server:

        jmx_command[sys.app.Hbase.requests] -P org.apache.hadoop.hbase.regionserver.HRegionServer -O hadoop:service=RegionServer,name=RegionServerStatistics -A requests
      • *Note that in the example above -P is the name of the initial class to start when the JRE runs the application. The initial class to start is the last item in the command line in almost every java command line.
    • -O object name: the object name of a Java managed bean.
    • -A attribute name: the attribute name of a Java managed bean.

    Object and attribute names for Java managed beans can be obtained from application component documentation or by using a tool such as JConsole.

    Example: a command to obtain compaction size from an Hbase region server would look like this:

    jmx_command[hbase.compactionSize] -P REGIONSERVER -O hadoop:service=RegionServer,name=RegionServerStatistics -A compactionSizeNumOps
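The sketch below shows one way to sanity-check a jmx_command entry before adding it to the Polled Data Config File, by confirming that all three required options are present. It is a hypothetical Python helper, not AppFirst code. It assumes option values contain no spaces unless they are quoted, and whether the collector itself accepts quoted values is not confirmed by this document.

```python
import re
import shlex

# Assumed shape, taken from the examples above: jmx_command[display-name] options
_JMX_RE = re.compile(r"^jmx_command\[(?P<name>[^\]]+)\]\s+(?P<options>.+)$")


def parse_jmx_command(line):
    """Split a jmx_command entry into its display name and its -P, -O and -A values."""
    match = _JMX_RE.match(line.strip())
    if not match:
        raise ValueError("not a jmx_command entry")
    tokens = shlex.split(match.group("options"))
    # Pair each flag with the token that follows it.
    options = dict(zip(tokens[0::2], tokens[1::2]))
    missing = [flag for flag in ("-P", "-O", "-A") if flag not in options]
    if missing:
        raise ValueError("missing required options: " + ", ".join(missing))
    return {
        "name": match.group("name"),
        "pattern": options["-P"],
        "object_name": options["-O"],
        "attribute": options["-A"],
    }
```

For the compaction size example, the result has the name hbase.compactionSize, the pattern REGIONSERVER and the attribute compactionSizeNumOps. An entry that omits -O or -A raises ValueError.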

    2. env_command has the form:

    env_command[JAVA_HOME | JAVA_USER]=name

    The env_command is used to set environment variables for use by the JMX process. There are two environment variables currently supported:

    • JAVA_HOME. A JAVA_HOME value is required. If it is not set, the script used to start the JMX process falls back to a default path.
    • JAVA_USER. The user account under which the JMX process runs.

    The following priorities are used for JAVA_HOME:

    1. A hard-coded JAVA_HOME variable in the start script overrides everything.
    2. JAVA_HOME defined in the config file overrides any environment variable. The form of the config element in the config file is:
       env_command[JAVA_HOME]=pathname
    3. An environment variable set using:
       % export JAVA_HOME=pathname
    4. If nothing is set, we try /usr/java/latest.

    *Note: The /usr/java/latest fallback may not work. It is a best-effort guess when nothing else is available. It does work on at least some versions of CentOS.
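The JAVA_HOME priority order described above can be sketched as a small Python function. This is an illustration of the lookup order only, not the actual start script. The function name resolve_java_home and its parameters are hypothetical, with script_value standing in for a value hard-coded in the start script.

```python
import os

# Last-resort path named in the documentation above.
DEFAULT_JAVA_HOME = "/usr/java/latest"


def resolve_java_home(script_value=None, config_lines=(), environ=None):
    """Pick JAVA_HOME: start script first, then config file, then environment, then default."""
    environ = os.environ if environ is None else environ
    # 1. A value hard-coded in the start script overrides everything.
    if script_value:
        return script_value
    # 2. An env_command[JAVA_HOME]=pathname entry in the Polled Data Config File.
    for line in config_lines:
        line = line.strip()
        if line.startswith("env_command[JAVA_HOME]="):
            value = line.split("=", 1)[1].strip()
            if value:
                return value
    # 3. The JAVA_HOME environment variable, else 4. the default path.
    return environ.get("JAVA_HOME") or DEFAULT_JAVA_HOME
```

Passing the environment as a plain dictionary keeps the function easy to try out without changing the real environment.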

    A JAVA_USER variable is used to control the user account under which the JMX process will run. This is needed in version 1 of the AppFirst JMX support. It will not be needed in version 2.

    The JAVA_USER variable is needed in order for the JMX process to connect effectively with the JVM from which you want to extract metrics.

    The JMX process uses the Sun JMX API to connect to a JVM on the same host. There are numerous benefits of using the Sun API. One of the limitations is the need to be the same user as the JVM from which metrics are being gathered.

    Therefore, if a JVM is running as user hbase, for example, the JMX process will need to be run as user hbase. Unfortunately, user root will not be able to connect effectively to a JVM running as user hbase.

    This limitation will be removed in the second phase of the AppFirst JMX capability.

    The default is to start the JMX process as user root.

    The user can be controlled in one of two ways:

    1. A hardcoded USER variable in the start script
    2. An env_command entry in your Polled Data Config File:
      env_command[JAVA_USER]=name

    This can be managed in the AppFirst web application in the Administration app. Here are a couple of examples:

    env_command[JAVA_HOME]=/usr/java/latest
    
    env_command[JAVA_USER]=hbase

    If you change the polled data config file to modify either the JAVA_HOME or the JAVA_USER environment variable, you will need to restart the JMX process. This is done as follows:

    % sudo /usr/share/appfirst/plugins/libexec/jmx-collector/jmxcollector restart

    Changes to any jmx_command[] entry in the polled data config file do NOT require any interaction from the user. All changes will be recognized and applied automatically.

    Creating StatsD Metrics from JMX Data

    If a JMX polled data entry uses the prefix “metric.” in the name, both Polled Data and StatsD metrics will be created from the same data source. For example, the command jmx_command[hbase.compactionSize] will create polled data. The command jmx_command[metric.hbase.compactionSize] will create polled data in addition to StatsD metrics. Where StatsD metrics are created, the JMX information can be displayed in the Dashboard using a StatsD widget. This naming convention provides the ability to manage how JMX data is used, whether it’s just in the form of Polled Data or as Polled Data AND as StatsD metrics.
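The naming convention above boils down to a prefix check. The short Python sketch below expresses it; the function name outputs_for and the output labels are hypothetical and are used only for illustration.

```python
def outputs_for(entry_name):
    """List which kinds of data a jmx_command display name produces."""
    outputs = ["polled_data"]  # every JMX entry creates Polled Data
    if entry_name.startswith("metric."):
        outputs.append("statsd")  # the "metric." prefix adds StatsD metrics
    return outputs
```

With this convention, hbase.compactionSize maps to Polled Data only, while metric.hbase.compactionSize maps to both Polled Data and StatsD metrics.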

    Logging

    *Note: Logging does not require restart in Versions 1 or 2.

    The JMX process logs data to the file /usr/share/appfirst/plugins/libexec/jmx-collector/jmx-collector.log. All logs are disabled by default. To enable logs, do the following:

    1. Edit the file /usr/share/appfirst/plugins/libexec/jmx-collector/log4j.properties
      The lines controlling log levels are:
      log4j.rootLogger=OFF, console, file
      log4j.logger.com.objectstyle.appfirst.jmx.collector=OFF
      Changing these to DEBUG or TRACE will provide details of the operation of the JMX process.
    2. Restart the JMX process
      % sudo /usr/share/appfirst/plugins/libexec/jmx-collector/jmxcollector restart

    Upgrading your JMX Collection Service

    % cd /usr/share/appfirst/plugins/libexec/jmx-collector/
    % git pull origin master
    % sudo /etc/init.d/afcollector restart
    % ps -ef | grep coll
    
    root     19029     1  2 18:03 ?        00:00:00 /usr/bin/collector
    
    hbase    19057     1  5 18:03 ?        00:00:00 /usr/java/latest//bin/java -cp .:/usr/java/latest//lib/jconsole.jar:/usr/java/latest//lib/tools.jar:appfirst-jmx-0.4-jar-with-dependencies.jar:appfirst-jmx-0.4.jar com.objectstyle.appfirst.jmx.collector.Application

    The ‘ps’ output, in the above example, shows that you are using jmx collector version 0.4.
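If you want to read the version out of the ‘ps’ output programmatically, a small Python sketch like the one below could do it. It assumes the jar naming pattern seen in the example output (appfirst-jmx-<version>), and the function name jmx_collector_version is hypothetical.

```python
import re

# Matches jar names such as appfirst-jmx-0.4.jar or appfirst-jmx-0.4-jar-with-dependencies.jar
_JAR_RE = re.compile(r"appfirst-jmx-(\d+(?:\.\d+)*)[-.]jar")


def jmx_collector_version(ps_output):
    """Return the JMX collector version found in ps output, or None if no jar is listed."""
    match = _JAR_RE.search(ps_output)
    return match.group(1) if match else None
```

Given the java process line above, this returns the version string 0.4. It returns None when only the /usr/bin/collector line is present.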

    Windows Performance Counters

    We include several performance counters by default with our collector installation. They are defined in the file c:\program files\appfirst\appaccess\config\PerfCounters.txt (assuming the default install folder). Entries starting with a “#” character are commented out. To enable a commented-out entry, simply remove the leading #. You can add entries for any Perf Counter that Windows supports by creating an entry in the Polled Data Config File. After you change the config, you need to restart the AppAccess service for the changes to take effect.

    For IIS:

    These are the default counters we define. Enable them by removing the leading #.
    #Web Service(_Total)=Bytes Total/sec -w 2000 -c 1000
    #Web Service(_Total)=Total Method Requests/sec -w 500 -c 200
    #Web Service(_Total)=Current Connections -w -c 1
    #Web Service Cache=File Cache Hits % -w 80 -c 70
    #Web Service Cache=Kernel: URI Cache Flushes -w 10 -c 5
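The entries above appear to follow the shape Object(Instance)=Counter -w warning -c critical, with a leading # marking a disabled entry. That shape is inferred from these examples and is not a documented grammar. Under that assumption, the Python sketch below splits an entry into its parts; the function name parse_perf_counter is hypothetical.

```python
import re

# Inferred shape: Object(Instance)=Counter name -w <warn> -c <crit>
# The instance part is optional, and either threshold may be empty.
_ENTRY_RE = re.compile(
    r"^(?P<object>[^=(]+)(?:\((?P<instance>[^)]*)\))?="
    r"(?P<counter>.+?)\s+-w\s*(?P<warn>\S*?)\s+-c\s*(?P<crit>\S*)$"
)


def parse_perf_counter(line):
    """Parse one PerfCounters.txt-style entry; return None if it does not match."""
    text = line.strip()
    enabled = not text.startswith("#")  # a leading '#' means commented out
    match = _ENTRY_RE.match(text.lstrip("#").strip())
    if not match:
        return None
    entry = match.groupdict()
    entry["enabled"] = enabled
    return entry
```

For the first IIS entry, this gives the object Web Service, the instance _Total, the counter Bytes Total/sec, a warning threshold of 2000 and a critical threshold of 1000, with enabled set to False until the leading # is removed.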

    Refer to the following URL for IIS specific perf counters:
    http://www.microsoft.com/technet/prodtechnol/WindowsServer2003/Library/IIS/7898b860-462c-4846-a3a8-1179f287ad88.mspx?mfr=true

    For OS:

    http://technet.microsoft.com/en-us/library/bb734903.aspx
    http://technet.microsoft.com/en-us/library/cc776490(WS.10).aspx
    http://technet.microsoft.com/en-us/library/cc782186(WS.10).aspx

    For DB (SQLServer):

    http://technet.microsoft.com/en-us/library/ms159650.aspx
    http://msdn.microsoft.com/en-us/library/ms189931.aspx
    http://msdn.microsoft.com/en-us/library/aa972242(SQL.80).aspx

    How Windows Performance Counters work on AppFirst

    If you’re using Windows, AppFirst will automatically collect the Performance Counters data. This data will show up automatically in the Polled Data sections of our product.

    To change the thresholds on your Windows Performance Counter alerts (or to add new alerts), log in to your server. Then open the file PerfCounters.txt in the AppFirst\AppAccess\conf folder, make your edits, and save the file. You must restart the AppAccess service for the changes to take effect. To do so, go to “Administrative Tools – Services,” select “AppAccess Service,” and click Restart.