Complex deployments require monitoring. It usually starts with a shell script that pings a remote server occasionally. Then you add “hey I just rebooted” emails to init.d. You might write some tests using wget, curl, or expect (or even an automation tool like Selenium) to verify functionality. It can get out of hand.
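As a rough illustration of that first step, here is a minimal sketch of a reachability check. It uses a TCP connect rather than an ICMP ping, since that needs no special privileges. The function names and any host or port you pass in are placeholders of mine, not part of any tool listed below.

```python
import socket


def check_tcp(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS failures.
        return False


def status_line(host, port, timeout=3.0):
    """Format a one-line UP/DOWN report suitable for a log or an email body."""
    state = "UP" if check_tcp(host, port, timeout) else "DOWN"
    return f"{host}:{port} {state}"
```

Run from cron, something like print(status_line("www.example.com", 80)) gives you a one-line report per check. Once you have a dozen of these, a real monitoring tool starts to look attractive.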
Let’s start off with a list of monitoring tools. I’m not endorsing anything here, just building a list and hoping for feedback:
- Nagios – very popular open source network monitoring
- Zenoss – newer network monitoring tool; uses Zope
- Hyperic – commercial, with a free basic version
- OpenNMS – open source network management platform
Additional tools
- Lilac Platform – used to configure Nagios
- Cacti – network graphing, often used with Nagios
SNMP is a standard protocol for checking network status. Everything from switches to SAN arrays can use SNMP to report their status. You can wrap JMX beans in SNMP. And you can write scripts that verify complex functionality and publish SNMP data, and then use your network monitoring tools to check status, send emails and pages, or take whatever action is needed.
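To make the "scripts that publish SNMP data" idea concrete, here is a minimal sketch that assumes Net-SNMP's snmpd and its pass directive. With pass, snmpd runs your program with -g OID for a GET or -n OID for a GETNEXT and expects three lines back: the OID, the type, and the value. If the program prints nothing, the request is treated as unanswered. The OID below sits under a made-up enterprise number, and check_app is a stand-in for whatever functional test you actually run.

```python
import sys

# Hypothetical OID under a made-up private enterprise number (99999).
BASE_OID = ".1.3.6.1.4.1.99999.1"


def parse_oid(text):
    """Turn '.1.3.6' into (1, 3, 6) so OIDs compare in lexicographic order."""
    return tuple(int(part) for part in text.strip(".").split("."))


def check_app():
    """Placeholder for a real functional test: 1 means healthy, 0 means not."""
    return 1


def handle(argv, check=check_app):
    """Return the response lines for one pass request, or [] for no answer."""
    if len(argv) != 2:
        return []
    flag, requested = argv
    try:
        req = parse_oid(requested)
    except ValueError:
        return []
    base = parse_oid(BASE_OID)
    is_get = flag == "-g" and req == base
    # GETNEXT: answer only if our single OID comes after the requested one.
    is_getnext = flag == "-n" and req < base
    if is_get or is_getnext:
        return [BASE_OID, "integer", str(check())]
    return []


if __name__ == "__main__":
    lines = handle(sys.argv[1:])
    if lines:
        print("\n".join(lines))
```

Assuming the script is saved at a path of your choosing (the one here is made up), the snmpd.conf entry would look something like: pass .1.3.6.1.4.1.99999.1 /usr/bin/python3 /usr/local/bin/app_status.py. Any of the tools above that can poll an OID can then alert on the value.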
Here’s another comparison.
http://poormanstech.blogspot.com/2007/01/alternatives-to-nagios.html
One comment in particular sums up a typical opinion:
OpenNMS -is good for networking gear.
ZENoss sucks…I couldn’t get what zenoss promises!
http://www.sage.org/lists/sage-members-archive/2008/msg00031.html
Nagios isn’t a monitoring system as much as it is an event reporting system…saying that Nagios is an event reporting system, you have to create events.
http://www.sage.org/lists/sage-members-archive/2008/msg00027.html
We did our evaluation and with all the touted features Zenoss came out on top. Deployment was much easier than Nagios, and setting up monitors was a breeze. Unfortunately as we dug deeper into Zenoss functionality we ran into a number of problems. First, the feature set as documented just doesn’t seem to be there…Nagios is clunky, it is ugly, it is a pain to configure. It also works.
http://www.sage.org/lists/sage-members-archive/2008/msg00028.html
For a dissenting vote, [Zenoss has] been very good to us…it’s got some really nice features Nagios completely lacks, and is improving at a much more rapid pace, fueled by a much larger core developer team. The thing about Zenoss is that everything is centralized – you get syslog, system check, snmpd checks, graphs, inventory, etc. all in one place, with one install… We no longer have to maintain a Cacti instance plus a Nagios instance plus inventories plus a Network monitoring instance plus a syslog parser like swatch
http://www.sage.org/lists/sage-members-archive/2008/msg00030.html
Zenoss is not nearly as useful if you don’t use snmp heavily
http://www.sage.org/lists/sage-members-archive/2008/msg00034.html
SNMP is nice for simple stats (routers, ethernet interfaces, disk usages, etc)…if you start to dig deeper into monitoring of services and service level management more and more specialized, agent-based tools come into action
http://www.sage.org/lists/sage-members-archive/2008/msg00040.html
OpenNMS is first and foremost a Network Monitoring system. It can autodiscovery and add new devices into to be monitored is your like…OpenNMS will do a deep probe and findout what the device is. It has good knowledge of various work equipment, and will automatically add interfaces or services it finds to be monitored. It does know about a decent range of network services (http, ftp, ssh, telnet, smtp, snmp, etc)…it’s system monitor is limited to what your can get out of SNMP via the HOST MIB
http://www.sage.org/lists/sage-members-archive/2008/msg00237.html
I had a test drive of OpenNMS. I share with you my findings:
http://technocrat.watson-wilson.ca/blosxom/computer/onmsreview.html
http://episteme.arstechnica.com/eve/forums/a/tpc/f/96509133/m/385006959831/inc/1
You might consider Zabbix…Zabbix won for it’s graphing and notification features.
I have recently written a hefty paper comparing primarily Nagios, OpenNMS and Zenoss, with a little on Cacti, MRTG and The Dude. The conclusion is a close run thing between Zenoss and OpenNMS but Zenoss wins – just. You can get the paper at http://www.skills-1st.co.uk/papers/jane/open_source_mgmt_options.pdf – comments welcome.
Cheers,
Jane
Ms. Jane Curry, that’s a GREAT and detailed paper you’ve shared with us. Thank you!
I disagree with your own final choice — based on your comments and criteria OpenNMS really seems like the winner — but you’ve done a lot of the homework I would have liked to do, and I appreciate the paper. Thanks again.
(I only found this thread looking for more info on “TheDude”, which you really only skim over, but that’s okay.)