Monday, August 2, 2010 | Posted by Jamie Duncan at 8:11 PM

Zabbix, oh how I love thee...

Unlike configuration management, systems and network monitoring has a TON of players in the field, and even more opinions on which one is best and why.  This is the 10,000-foot view of why we are going with Zabbix (http://www.zabbix.com) here at 5AM.

A Google search for "server monitoring" returns ~8.8 million results.  From that list we used the following (sometimes company-specific) rules to narrow it down a bit.

  • Open Source - 5AM stands firmly behind open-source software, and we use it (and try to contribute back to it) as much as we possibly can
  • Community Support - we're going to need some help from the community to get where we need to go.  We want that community to be involved and helpful, if not large.
  • Linux - We run Linux servers (almost) exclusively, and want this app to run (well) on Linux
  • Postgres - We would like (but don't have to have) a database-agnostic app and, failing that, one that can run with a Postgres backend
  • Networking AND Systems - While Cacti is great, it doesn't handle servers very well.  We want something that can do both.
  • Flexible, REALLY Flexible - While Nagios may call itself the standard, we think writing plugins for it STINKS.  We want to be able to quickly monitor anything we think of.
  • Low setup / maintenance overhead - We don't want to make a career out of maintaining our monitoring system.
  • Historical Data - We want it.
  • Graphs - We need them. Everybody is a visual learner when you're staring at 6 months' worth of processor idle time values.
After taking all of these under consideration, we decided to go with Zabbix as our enterprise monitoring solution.

Some of the highlights:
  • The server was up and listening within 10 minutes.  One config file to alter, and you're rocking on the server end with hundreds of checks in dozens of default templates.
  • The agent install is similarly easy.  One config file and it's off.
  • Anything you can output to the command-line can become a value to be evaluated and tracked. EASILY.
  • Had an ASA 5505 up and monitored for SNMP data in less than 10 minutes.
  • An increasingly well-performing web frontend.  Speedy, pretty, and intuitive (as intuitive as any of these tools are these days)
  • LDAP integration was super easy
  • JSON-RPC based API with community-supported Python bindings
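
To make the "anything you can output to the command-line" point concrete, here is a minimal sketch of a custom check, not something lifted from our actual setup.  The script prints a single number to stdout, which is all the agent needs.  The script name, the default directory, and the item key below are hypothetical; the only Zabbix-specific piece is a UserParameter line in the agent config, something like UserParameter=custom.dir.entries[*],/usr/local/bin/count_entries.py $1

```python
#!/usr/bin/env python3
"""Print one number to stdout so a monitoring agent can record it."""
import os
import sys


def count_entries(path):
    """Return the number of entries directly inside `path`."""
    return len(os.listdir(path))


if __name__ == "__main__":
    # Hypothetical default directory; pass another one as the first argument.
    target = sys.argv[1] if len(sys.argv) > 1 else "/tmp"
    # The agent takes whatever is printed here as the item's value.
    print(count_entries(target))
```

Any other command that prints one value would work the same way; Python is just what we had handy.
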
Of course, nothing is perfect.  Some of the "aww man" moments with Zabbix:
  • While it works with PostgreSQL, it's definitely aimed towards MySQL.  PostgreSQL support can be a little bit squirrely, but we're working on it.
  • It can't currently inherit LDAP users/groups
  • The frontend is in PHP.  We'd love to see a more powerful language for a web portal of such a powerful tool.
  • The API is great for pushing information into Zabbix.  Pulling information OUT? Not quite as easy or straightforward.
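
For a feel of what talking to a JSON-RPC API looks like from Python, here is a rough sketch that only builds a request body and unpacks a response, with no network calls.  The envelope fields (jsonrpc, method, params, id) come from the JSON-RPC 2.0 spec.  The method name history.get, its parameters, the item ID, and passing the session token in an "auth" field are assumptions to check against the API docs for whatever Zabbix version you run.

```python
import itertools
import json

# JSON-RPC wants an id on every request so replies can be matched up.
_request_ids = itertools.count(1)


def build_request(method, params, auth=None):
    """Build a JSON-RPC 2.0 request body as a plain dict."""
    request = {
        "jsonrpc": "2.0",
        "method": method,
        "params": params,
        "id": next(_request_ids),
    }
    if auth is not None:
        # Assumption: the session token travels in the request body.
        request["auth"] = auth
    return request


def extract_result(response_text):
    """Return the "result" member of a JSON-RPC reply, or raise on "error"."""
    response = json.loads(response_text)
    if "error" in response:
        raise RuntimeError("API error: {}".format(response["error"]))
    return response["result"]


if __name__ == "__main__":
    # Hypothetical item ID and token, for illustration only.
    body = build_request(
        "history.get",
        {"itemids": ["12345"], "limit": 10},
        auth="<session token>",
    )
    print(json.dumps(body, indent=2))
```

Sending the body is then a matter of POSTing it as JSON to the frontend's API endpoint with whatever HTTP client you like.
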
