I manage a whole number of device and servers, which are monitored by various utilities, including Nagios. I also have clients who do the same, as well as using other tools that produce notifications – build systems etc.

Nagios is the thing that tells me when my web server is unavailable, or the database has fallen over, or, more often, when my internet connection dies. I have similar setups in various client networks that I maintain.

It logs to the system log, sends me emails, and in some urgent cases, sends a ping to my phone. All very handy, but isn’t very handy for other casual users who may just want to see if things are running properly. For those users, who are somewhat non-technical, it’s a bit much to ask them to read logs, and emails often get lost.

For one of my clients we had a need to be able to collect these status updates from different sources together, make it more persistent, and make it visible in a much more accessible way than log messages (which has a very poor signal to noise ratio) or email alerts (which only go to specific people).

“Known” issues

A solution I came up with was to create a Known site for the network which can be used to log these notifications in a user friendly, chronological and searchable form.

I created an account for my Nagios process, and then, using my Known command line tools, I extended the Nagios script to use my Known site as a notification mechanism.

In commands.cfg:

Then in conf.d/contacts.cfg I extended my “Root” contact:

Finally, the script itself, which serves as a wrapper around the api tools and sets the appropriate path etc:

Consolidating rich logs

Of course, this is only just the beginning of what’s possible.

For my client, I’ve already modified their build system to post on successful builds, or build errors, with a link to the appropriate logs. This particular client was already using Known for internal communication, so this improvement was logical.

The rich content types that Known supports also raises the possibility of richer logging from a number of devices, here’s a few thoughts of some things I’ve got on my list to play with:

  • Post an image to the channel when motion is detected by a webcam pointed at the bird feeders (again, trivial to hook up – the software triggers a script when motion is detected, and all I have to do is take the resultant image and CURL it to the API)
  • Post an audio message when a voicemail is left (although that’d require me to actually set up asterisk, which has been on my list for a while now)
  • Attach debugging info & a core dump to automated test results

I might get to those at some point, but I guess my point is that APIs are cool.

At home, which is also my office, I have a network that has a number of devices connected to it. Some of these devices – wifi base stations, NAS storage, a couple of raspberry pis, media centers – are headless (no monitor or keyboard attached), or in the case of the media center, spend their time running a graphical front end that makes it hard to see any system log messages that may appear.

It would be handy if you could send all the relevant log entries to a server and monitor all these devices from a central server. Thankfully, on *nix at least, this is a pretty straightforward thing to do.

The Server

First, you must configure the system log on the server to accept log messages from your network. Syslog functionality can be provided by one of a number of syslog servers, on Debian 6 this server is called rsyslog.

To enable syslog messages to be received, you must modify /etc/rsyslog.conf and add/uncomment the following:

# Provides UDP syslog reception
$ModLoad imudp
$UDPServerRun 514

# Provides TCP syslog reception
$ModLoad imtcp
$InputTCPServerRun 514

Then, restart syslog:

/etc/init.d/rsyslog restart

Although this is likely to be less of an issue for a local server, you should ensure that your firewall permits connections from your local network to the syslog server (TCP and UDP ports 514).

The Clients

Your client devices must be configured to then send their logs to this central server. The concept is straightforward enough, but the exact procedure varies slightly from server to server, and device to device. If your client uses a different syslog server, I suggest you do a little googling.

The principle is pretty much the same regardless, you must specify the location of the log file server and the level of logs to send (info is sufficient for most purposes). In the syslog configuration file add the following to the bottom:

*.info @192.168.0.1

On Debian/Ubuntu/Raspian clients, this setting is in the /etc/rsyslog.d/50-default.conf file.

Some embedded devices, like my Buffalo AirStation, have an admin setting to configure this for you. Other devices, like my Netgear ReadyNAS 2, has a bit more of an involved process (in this specific case, you must install the community SSH plugin, and then edit the syslog configuration manually).

Monitoring with logwatch

Logwatch is a handy tool that will analyse logs on your server and generate administrator reports listing the various things that have happened.

Out of the box, on Debian at least, logwatch is configured to assume that only log entries for the local machine will appear in log files, which can cause the reports to get confused. Logwatch does support multiple host logging, but it needs to be enabled.

The documented approach I found, which was to create a log file in /etc/logwatch/conf didn’t work for me. On Debian, this directory didn’t exist, and the nightly cron job seemed to ignore settings in both logwatch.conf and override.conf.

I eventually configured logwatch to handle multiple hosts, and to send out one email per host, but modifying the nightly system cronjob. In /etc/cron.daily/00logwatch, modify the execute line and add a --hostformat line:

#execute
/usr/sbin/logwatch --output mail --hostformat splitmail

After which you should receive one email per host logged by the central syslog server.

BE-UnlimitedOk, so some of the regular readers of this blog will sense a bit of a theme with my recent posts, and get the feeling that I’m essentially trying to graph the world.

Guilty.

Anyway, I get my home internet through Be. The ADSL line around here was a little flakey some time ago, and after going through the 3rd splitter in as many weeks I got a BT engineer out to sort out the local switching equipment. Things are working fine and dandy now, however I thought it would be cool to keep an eye on the ADSL modem’s key stats, just in case I had any more problems.

Getting started

In the 3rd party contributor repository, there is a plugin called BeBoxSync, written by Alex Dekker. The documentation for this plugin appears to have only existed on his website, which appears to no longer be available and has expired from google cache.

The plugin consists of two perl scripts which use an expect script to pull information from the ADSL modem via telnet. These original scripts may work out of the box for you, however for my BeBox (Thomson TG585v7 modem running software version 8.2.7.7), I needed to make some changes to get it to work. Basically, I needed to get the expect script to probe for extended information (which is no longer provided by the adsl info command), and to look in a slightly different part of the output for some required data.

My modifications are available on github, and the originals are here. I’d suggest you try my version first, as the originals haven’t been maintained for a fairly long time.

Installation

First, you must install expect. Expect is a little tool that lets you script interactive sessions like telnet. It is quite often installed on default installations, but is considered rather oldskool, so may not be (it wasn’t on my Debian 6 server)…

apt-get install expect

Next, after you have downloaded the scripts to somewhere sensible, you will need to make the following modifications:

  • Edit beboxstats.expect and enter the IP address of your modem and your administrator password in the appropriate places.
  • Edit beboxstats and enter the absolute path to the beboxstats.expect script.
  • Edit beboxsync and do the same.
  • Ensure all three scripts are executable by the munin user

You can test things are working by executing each script from the terminal. You should see a whole bunch of data about your modem when executing the expect script, and the values for each key field listed in munin plugin format when executing the munin scripts. Check these values against the values on your modem’s stats page (default: http://192.168.1.254/cgi/b/dsl/dt/?be=0&l0=1&l1=0) to verify that you are getting the correct values reported.

Finally, link to your scripts from within your munin plugins directory in the usual way. If things are working, you should see some new graphs appear in your “network” section.

beboxsync-day beboxstats-day

As you can see from these images, I monitored a sharp drop in line quality with a corresponding drop in bandwidth. Still investigating the cause…