In a recent forum post, I asked how to add a test for a process that runs too long. Under normal circumstances, this process should start, run for five minutes, and then end. If it runs for more than an hour, there is a problem.

Here are the steps I used to set this up:

  1. Identify the process you want to measure.
  2. Run proctime.py process. This Python script, provided by Zen Master cluther (thank you very much!), returns the number of seconds that process has been running.
  3. Write a shell script that saves the output of proctime.py process to a file, and have a cron job call it every five minutes. Doing this via cron, rather than calling the shell script from snmpd, has two advantages: a) the shell script can test for a NULL result and write a 0 instead, and b) snmpd does not wait on the command to execute (running cat on a file is faster than executing the command).
  4. Configure snmpd to use cat to dump the contents of that file.
  5. In Zenoss, build a template with a new Data Source that queries 1.3.6.1.4.1.2021.8.1.101.2, a.k.a. UCD-SNMP-MIB::extOutput.2.
  6. In Zenoss, on that template, build a Threshold and a Graph from the Data Source as you please. I wanted both.
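
Steps 3 and 4 could look roughly like the sketch below. This is not the script I actually run, only a minimal illustration. The location of proctime.py, the name save_proctime.sh, and the way proctime.py is invoked (directly executable, process as its only argument) are assumptions. The output file path and the name getPROCESS.proctime are the ones that appear in the snmpwalk output further down. The snmpd.conf line assumes the net-snmp "exec" directive, which is what populates the extTable under 1.3.6.1.4.1.2021.8.

```shell
#!/bin/sh
# Sketch of the cron-driven wrapper from step 3.
# PROCTIME is a hypothetical install location; adjust both paths to your system.
PROCTIME="${PROCTIME:-/usr/local/bin/proctime.py}"
OUT="${OUT:-/var/net-snmp/PROCESS.proctime.seconds}"   # file that snmpd will cat

# Print the argument if it is a non-negative integer, otherwise print 0.
sanitize_seconds() {
    case "$1" in
        ''|*[!0-9]*) echo 0 ;;
        *)           echo "$1" ;;
    esac
}

# Run proctime.py for the named process and store the cleaned value.
# Writing to a temp file and renaming it means snmpd never reads a
# half-written file.
write_proctime() {
    seconds=$("$PROCTIME" "$1" 2>/dev/null)
    sanitize_seconds "$seconds" > "$OUT.tmp" && mv "$OUT.tmp" "$OUT"
}

# When called with a process argument (as from cron), do the work.
if [ $# -gt 0 ]; then
    write_proctime "$1"
fi

# Crontab entry, assuming this file is saved as /usr/local/bin/save_proctime.sh:
#   */5 * * * * /usr/local/bin/save_proctime.sh PROCESS
#
# snmpd.conf entry for step 4:
#   exec getPROCESS.proctime /bin/cat /var/net-snmp/PROCESS.proctime.seconds
```

The index in extOutput.2 reflects the position of the exec line in snmpd.conf, so on a host where this is the only exec entry the Data Source would end in .1 instead of .2.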

The output below shows that when I query HOSTNAME, PROCESS has been running for 16 seconds (UCD-SNMP-MIB::extOutput.2 = 16).

[zenoss@bby1ems01 ~]$ snmpwalk -v2c -c itsasecret \
HOSTNAME 1.3.6.1.4.1.2021.8.1 | grep '\.2 '
UCD-SNMP-MIB::extIndex.2 = INTEGER: 2
UCD-SNMP-MIB::extNames.2 = STRING: getPROCESS.proctime
UCD-SNMP-MIB::extCommand.2 = STRING: /bin/cat /var/net-snmp/PROCESS.proctime.seconds
UCD-SNMP-MIB::extResult.2 = INTEGER: 0
UCD-SNMP-MIB::extOutput.2 = STRING: 16
UCD-SNMP-MIB::extErrFix.2 = INTEGER: 0
UCD-SNMP-MIB::extErrFixCmd.2 = STRING:
[zenoss@bby1ems01 ~]$ 
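
Before building the Threshold in Zenoss, it can be handy to check the value by hand. The sketch below is only an illustration: the function names and the LIMIT variable are made up for this example, and it assumes the value comes back in the "STRING: 16" form shown above. The 3600-second limit is the "over an hour" rule from the start of this post.

```shell
#!/bin/sh
# Sketch: compare the extOutput value against a one-hour limit.
LIMIT="${LIMIT:-3600}"   # seconds

# Extract the value from a line such as
#   UCD-SNMP-MIB::extOutput.2 = STRING: 16
# Quotes are stripped in case the agent returns the string quoted.
ext_output_value() {
    printf '%s\n' "$1" | sed -e 's/^.*STRING: *//' -e 's/"//g'
}

# Succeed (exit status 0) when the process has run longer than LIMIT seconds.
runs_too_long() {
    value=$(ext_output_value "$1")
    case "$value" in
        ''|*[!0-9]*) return 1 ;;   # not a number: treat as "not too long"
    esac
    [ "$value" -gt "$LIMIT" ]
}

# Example against a live host, using the same HOSTNAME and community string:
#   line=$(snmpget -v2c -c itsasecret HOSTNAME 1.3.6.1.4.1.2021.8.1.101.2)
#   runs_too_long "$line" && echo "PROCESS has been running for over an hour"
```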

I hope this helps.

David


comments:

Similar to How to Monitor a SW RAID device --david_sloboda, Mon, 27 Oct 2008 19:41:52 -0500

http://www.zenoss.com/community/docs/howtos/how-to-monitor-a-software-raid-device/