I have recently been doing a lot of development work for a very large Known installation. This installation is highly customised, has many active users, all doing unexpected and creative things with the platform, and makes use of many of Known’s more advanced features in often quite unexpected ways.

As with everything built by mankind, sometimes things go wrong, which is especially true with something as complicated as software. Simply waiting for a user to report a fault, and for that report to bubble up through IT/management, is poor customer service. The time between the fault being found, and the report being received, is often measured in weeks, and is often misleading/missing crucial information, leading to more time spent clarifying the fault.

So, recently, I’ve been exploring a number of ways to handle any issues in a much more proactive way, and to collect objective data, rather than subjective fault reports.

Crash reports

One thing I’ve been exploring is a simple mechanism whereby Known will send an email to one or more addresses when a fatal error or exception occurs. This email contains the details of the error, as well as who was logged in at the time.

You can try this for yourself if you’re tracking Known’s master branch by adding the following to your config.ini:

This is great for when something blows up, but often problems are much more subtle than that. For example, what happens if a change you’ve made causes an increase in page load for certain users? How would you track back and find out when it started, and what was the change that might have caused it?

Running stats and health metrics

In the latest master build, I’ve added a mechanism to start collecting useful metrics of a running system – page build time, events tracking, instances of errors, etc – which when properly analysed will give a much more useful overview of the general health of a running system.

I’m a big fan of graphs over logs for this sort of thing.

Currently the stats handler is a dummy which throws this information away, however it’d be a simple matter to extend this functionality to use something RRDTool or StatsD, with Graphite over the top to generate the graphs.

Recording your stats

If you’re a plugin writer, you can push your own statistics using the same tool, e.g.:

Give it a play!

2 thoughts on “Collecting performance/health metrics in Known

Leave a Reply