At home, which is also my office, I have a network with a number of devices connected to it. Some of these devices – Wi-Fi base stations, NAS storage, a couple of Raspberry Pis, media centers – are headless (no monitor or keyboard attached), or, in the case of the media centers, spend their time running a graphical front end that makes it hard to see any system log messages that appear.

It would be handy to send all the relevant log entries to a central server and monitor all these devices from one place. Thankfully, on *nix at least, this is a pretty straightforward thing to do.

The Server

First, you must configure the system log on the server to accept log messages from your network. Syslog functionality can be provided by one of a number of syslog servers; on Debian 6 this server is called rsyslog.

To enable syslog messages to be received, you must modify /etc/rsyslog.conf and add/uncomment the following:

# Provides UDP syslog reception
$ModLoad imudp
$UDPServerRun 514

# Provides TCP syslog reception
$ModLoad imtcp
$InputTCPServerRun 514

Then, restart syslog:

/etc/init.d/rsyslog restart

Although this is likely to be less of an issue for a local server, you should ensure that your firewall permits connections from your local network to the syslog server (TCP and UDP ports 514).
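
If you happen to be running an iptables firewall, rules along these lines should do the trick – a minimal sketch, assuming your LAN is 192.168.0.0/24 (adjust the range to match your own network):

# accept syslog traffic on port 514 (UDP and TCP) from the local network only
iptables -A INPUT -p udp -s 192.168.0.0/24 --dport 514 -j ACCEPT
iptables -A INPUT -p tcp -s 192.168.0.0/24 --dport 514 -j ACCEPT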

The Clients

Your client devices must then be configured to send their logs to this central server. The concept is straightforward enough, but the exact procedure varies slightly from syslog server to syslog server, and from device to device. If your client uses a different syslog server, I suggest you do a little googling.

The principle is pretty much the same regardless: you must specify the address of the log server and the level of logs to send (info is sufficient for most purposes). In the syslog configuration file, add the following at the bottom:

*.info @192.168.0.1

On Debian/Ubuntu/Raspbian clients, this setting is in the /etc/rsyslog.d/50-default.conf file.
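
For what it’s worth, a single @ tells rsyslog to forward over UDP; if you enabled the TCP listener on the server, a double @ forwards over TCP instead. A quick sketch (the explicit port is optional and defaults to 514):

# send everything at info level and above to the central server over UDP
*.info @192.168.0.1
# or, using the TCP listener instead
*.info @@192.168.0.1:514

Once the client’s rsyslog has been restarted, something like logger -p user.info "hello from the client" is a quick way to check that messages are arriving in /var/log/syslog on the server.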

Some embedded devices, like my Buffalo AirStation, have an admin setting to configure this for you. Other devices, like my Netgear ReadyNAS 2, have a slightly more involved process (in this specific case, you must install the community SSH plugin and then edit the syslog configuration manually).

Monitoring with logwatch

Logwatch is a handy tool that will analyse logs on your server and generate administrator reports listing the various things that have happened.

Out of the box, on Debian at least, logwatch is configured to assume that only log entries for the local machine will appear in log files, which can cause the reports to get confused. Logwatch does support multiple host logging, but it needs to be enabled.

The documented approach I found, which was to create a configuration file in /etc/logwatch/conf, didn’t work for me. On Debian, this directory didn’t exist, and the nightly cron job seemed to ignore settings in both logwatch.conf and override.conf.

I eventually configured logwatch to handle multiple hosts, and to send out one email per host, by modifying the nightly system cron job. In /etc/cron.daily/00logwatch, modify the execute line to add a --hostformat option:

#execute
/usr/sbin/logwatch --output mail --hostformat splitmail

After this change, you should receive one email per host logged by the central syslog server.

It has been a few weeks since I finally received my Raspberry Pi, but up until today I have been too busy to play with it.

This changed today when I finally installed a boot image on a 4GB SD card, wired the tiny little circuit board up to the TV and connected the power. I was very gratified when my TV sprang into life and I was greeted with a booting Linux system!

I had a little play before I had to get back to work, and first impressions were very positive. I opted for the recommended Debian-based image, a distribution I am very familiar with. Network and USB functioned straight out of the box, and I was even able to install a few packages.

I’m looking forward to tinkering with it some more!

By default, the standard LAMP (Linux, Apache, MySQL, PHP/Perl/Python) stack doesn’t come particularly well optimised for handling more than a trivial amount of load. For most people this isn’t a problem: either they’re running on a large enough server, or their traffic is at a level where they never hit the limits.

Anyway, I’ve run up against these limits on a number of occasions now, and while there are many good articles out there on the subject, I thought I’d write down my notes. For my own sake as much as anything else…

Apache

Apache’s default configuration on most Linux distributions is not the most helpful, and your goal here is to do everything possible to avoid the server having to hit swap and start thrashing.

  • MaxClients – The important one. If this is set too high, Apache will merrily spawn new servers to handle new requests, which is great until the server runs out of memory and dies. Rule of thumb:

    MaxClients = (Memory - other running stuff) / average size of apache process.

    If you’re serving dynamic PHP pages or pulling a lot of data from databases, the amount of memory a process takes up can quickly balloon – sometimes to as much as 15-20MB. Over time, all running Apache processes will tend towards the size of your largest script. There is a worked example after this list.

  • MaxRequestsPerChild – Setting this to a non-zero value will cause these large spawned processes to eventually die and free their memory. Generally this is a good thing, but set the value fairly high, say a few thousand.
  • KeepAliveTimeout – By default, Apache keeps connections open for 15 seconds waiting for subsequent requests from the same client. This can cause processes to sit around, eating up memory and resources which could be used for incoming requests.
  • KeepAlive – If the number of simultaneous clients hitting your site is greater than the value of MaxClients (as it is in most typical thundering herd slashdottings), strongly consider turning this off.
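
To make the rule of thumb concrete, here is a minimal sketch for Apache 2.2’s prefork MPM. The numbers are assumptions for illustration only: a box with 2GB of RAM, roughly 500MB used by MySQL and everything else, and an average Apache process of about 20MB gives (2048 - 500) / 20 ≈ 75.

    <IfModule mpm_prefork_module>
        StartServers          5
        MinSpareServers       5
        MaxSpareServers      10
        # (2048MB - 500MB) / 20MB per process, rounded down a little
        MaxClients           75
        # recycle children occasionally so bloated processes give memory back
        MaxRequestsPerChild  2000
    </IfModule>

    # keep keep-alive, but don't let idle connections hog workers for long
    KeepAlive On
    KeepAliveTimeout 2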

Caching

  • Squid – A Squid reverse proxy sits on your server and caches requests, turning expensive dynamic pages into simple static ones, meaning that at periods of high load, requests never need to touch Apache. Configuration seems complex at first, but all that is really required is to run Apache on a different port (port 81 in the example below), run Squid on port 80 and configure Apache as a cache peer, e.g.


    http_port 80 accel defaultsite=www.mysite.com vhost
    cache_peer 127.0.0.1 parent 81 0 no-query originserver login=PASS name=myAccel

    One gotcha I found is that you have to name the domains you’ll accept proxying for, otherwise you’ll get a bunch of Access Denied errors; in a vhost environment with multiple domains this can be a bit fiddly.

    A workaround is to define an ACL listing just the top-level domains, e.g.

    acl our_sites dstdomain .uk .com .net .org

    http_access allow our_sites
    cache_peer_access myAccel allow our_sites

  • PHP code cache – Opcode caching can boost performance by caching compiled PHP. There are a number of these out there, but I use xcache, purely because it was easily apt-gettable (a rough sketch of the setup follows this list).
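
For what that setup looks like in practice, here is a hedged sketch – the package name and ini path are what I’d expect on a Debian-era system, and xcache.size is the main knob, but treat the specifics as assumptions rather than gospel:

    # install the opcode cache (Debian/Ubuntu package name at the time of writing)
    apt-get install php5-xcache

    # then give the cache enough room in /etc/php5/conf.d/xcache.ini,
    # e.g. xcache.size = 64M, and restart Apache for it to take effect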

PHP

It goes without saying that you’d probably want to make your website code as efficient as possible, but don’t spend too much energy on this – there is lower-hanging fruit, and as a rule of thumb memory and CPU are cheap when compared to developer resources.

That said, PHP is full of happy little gotchas, so…

  • Chunk output – If your script makes use of output buffering (which Elgg does, and a number of other frameworks do too), be sure that when you finally echo the buffer you do it in chunks.

    It turns out (and this bit us on the bum when building Elgg) that there is a bug/feature/interaction between Apache and PHP (some internal buffer that gets burst, or something similar) which can add multiple seconds onto page delivery if you attempt to output large blocks of data all at once. There is a short sketch of the chunked approach after this list.

  • Avoid calling array_merge in a loop – When profiling Elgg some time ago I discovered that array_merge was (and I believe still is) horrifically expensive. The function does a lot of validation which in most cases isn’t necessary and calling it in a loop is ruinous. Consider using the “+” operator instead.
  • Profile – Profile your code using Xdebug and find out where the bottlenecks are; you’d be surprised what is expensive and what isn’t (see the previous point).
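
As promised, a minimal sketch of the chunked-output idea. The chunk size and variable names are arbitrary choices of mine, not values taken from Elgg or Apache:

    <?php
    // assumes ob_start() was called earlier and the page has been generated
    $buffer = ob_get_clean();

    // echo the buffer in modest chunks rather than one enormous block,
    // sidestepping the Apache/PHP large-write slowdown described above
    $length = strlen($buffer);
    $chunk  = 4096; // arbitrary chunk size

    for ($offset = 0; $offset < $length; $offset += $chunk) {
        echo substr($buffer, $offset, $chunk);
    }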

A non-exhaustive list, but I hope it helps!