Yesterday, I wrote a post outlining a draft specification for a possible way to handle login on a distributed social network, together with a reference implementation for Known.

I got some really positive feedback, including someone pointing out a potential replay vulnerability with the protocol as it stands.

I admit I had overlooked replay as an attack vector (oops!), but since peer review is exactly why open standards are more secure than propriatory standards, I thought I’d kick off the discussion now!

The Replay problem

Alice wants to see something that Bob has written, so logs in according to the protocol, however Eve is listening to the exchange and records the login. She then, later, sends the same data back to Bob. Bob sees the signature, sees that it is valid, and then logs Eve in as Alice.

Worse, Eve could send the same packet of data to Clare and David’s site as well, all without needing access to Alice’s key.

Eve needs to be able to intercept Alice’s login session, which, if HTTPS has been deployed is largely impractical, but since this can’t always be counted on I’d like to improve the protocol.

Countermeasures

Largely, countermeasures to a replay attack take the form of creating the signature over something non-repeatable and algorithmically verifiable that Alice can generate and Bob can check.

This may be some sort of algorithmically generated hash, a timestamp, or even just a random number, or record whether we have seen a specific signature before.

My specific implementation has an additional wrinkle in that it has to function over a distributed network, in which each node doesn’t necessarily talk to each other (so we can’t check whether we’ve seen a signature or random number before, since Bob might have seen it, but Clare and David won’t have).

I also want to avoid adding too much complexity, so I’d like to avoid, if I can, doing some sort of multi-stage handshaking; for example hitting an endpoint on the server to obtain a random session id, then signing that and sending it back. Basically, I’d still like to be able to talk to a server using Unix command line tools (gpg) and CURL if I can!

Proposed revision

Currently, when Alice logs in to Bob’s site, Alice signs their profile URL using her key and sends it to Bob. Bob then uses this profile url to verify that Alice is someone with access to Bob’s site/post and then users the signature to verify that it is indeed Alice who’s attempting to log in.

What I propose, is that in addition to forming the signature over Alice’s profile URL, she also forms it over the URL of the page she is trying to see, and also the current time in GMT.

Including the requested URL in the signature allows Bob to verify that the request is for access on his site. If Eve sent this packet to Clare or Dave, it could be easily discarded as invalid.

Adding the timestamp allows Bob to check that this isn’t an old packet being replayed back. Since any implementation should have a small tolerance (perhaps a few minutes either side) to allow for clock drift, using a timestamp allows a small window of attack where Eve could replay the login. To counter this, Bob’s implementation should remember, for a short while, timestamps received for Alice and if the same one is seen twice invalidate all of Alice’s sessions.

  • Why invalidate all of Alice’s sessions when we see the same timestamp twice, can’t we just assume that the second packet is Eve?”

    Sadly not – sophisticated attackers are able to attack from a position physically close to you, so Eve’s login may be received first. In the situation where two identical login requests are received, it is probably safer to treat both as invalid.

    Perhaps a sophisticated implementation could delay Alice’s first login for a few seconds (after verifying) to see if any duplicates are received, and only proceed if there are none. This would limit the need to permanently store timestamps against a user’s account, but may be more complex from an implementation point of view.

  • Why use a timestamp rather than a random number?

    I was going back and forth on this… a random number (nonce) would remove the vulnerability window, but it would require Bob’s site to store every number we’ve seen thus far, so I finally opted not to take this approach.

I’d be interested in your thoughts, so please, leave a comment!

If you’re anything like me, you often find yourself needing to add create and modified timestamps to data stored in a MySQL database. Typically, I’d do this programmatically from within the application without giving it a second thought, wrapping up time(); in an SQL statement and firing it over.

Recently, while doing some development work for one of my clients, I was given the requirement to add Created and Modified timestamps to a whole bunch of existing data tables. These tables were referenced by hundreds of different MySQL queries, and to complicate matters further, we were in the middle of migrating the code over to a new database library.

I didn’t much fancy rewriting all of the SQL queries, potentially twice, just to add a couple of timestamps, so modifying code wasn’t much of an option. Thankfully, a couple of MySQL’s internal features came to the rescue.

Adding a ModifiedTime

Adding a modified timestamp to a table is the most straight forward. All your have to do is create the field of type TIMESTAMP, and by default, MySQL will automatically update the field when the row is modified.

There are a couple of things to be aware of:

  • While you can have multiple TIMESTAMP fields in a row, only one of these can be automatically updated with the current time on update.
  • If your UPDATE query contains a value for your ModifiedTime field, this value will be used.

So, to add your modified timestamp field to an existing table, all you need is:

ALTER TABLE my_table ADD ModifiedTime TIMESTAMP;

Adding a CreatedTime

Adding a CreateTime value is a little more involved.

On the latest versions of MySQL it is apparently possible to create a DateTime field with a default value of CURRENT_TIMESTAMP. This wasn’t an option for me as I was having to support a somewhat older version, besides, even on the newer versions of MySQL it is not possible to have more than one field using CURRENT_TIMESTAMP, which of course we are in order to get ModifiedTime working.

So, in order to get a created timestamp, firstly we must add a DATETIME field to the table.

ALTER TABLE my_table ADD CreatedTime datetime NOT NULL;

Note, that this must be created as NOT NULL in order for the next part to work (this is because setting NOT NULL forces an automatic all zeros default).

Next, we must create a trigger, which will automatically be fired when we insert a value into our table and set the created timestamp.

DELIMITER //
DROP TRIGGER IF EXISTS `my_table_insert_trigger`//
CREATE TRIGGER `my_table_insert_trigger`
BEFORE INSERT ON `my_table`
FOR EACH ROW
BEGIN
IF NEW.CreatedTime = '0000-00-00 00:00:00' THEN
SET NEW.CreatedTime = NOW();
END IF;
END;//
DELIMITER ;

Now, when you insert a value into the table, this trigger will fire and, if you’ve not provided a CreatedTime field in your insert query, it will be set to the current time stamp.

Conclusion

Since all the queries in the application code specified the columns they were updating on insert, I was able to use this method to add created and modified time stamp fields to all the existing object tables in the database, without needing to modify any of the existing application code.

This simplified my life greatly, but also suggests to me that this method might be somewhat more efficient than the methods I’d previously used.

Once upon a time, time was, by and large, governed by the sun. Least ways it used to be, but then came the railways. Y’see the trouble was that time in those days, or at least our measurement of it, was typically read off the sundial in the town square. Since the angle of the sun varies with latitude this meant that while it might be 2pm in London, it may be 2.30 in Edinburgh.

This local variation didn’t really matter in a world where the fastest form of transport available ran on grass and sugar lumps, but in the age of steam trains and telegraph, it started causing problems. If a train was due in at 10:15, did that mean London time or Birmingham time? So, the rail companies started using GMT (then known as “railway time”).

Now, we live in a connected world. It is now possible to communicate anywhere in the world instantly, and oceans which once would have taken months to cross can be crossed in hours. Many of us work in teams which span continents and time zones. Age old local time issues persist; a meeting scheduled for 3pm… is that GMT, Local (currently BST) or someone else’s local time? Where exactly is the person who scheduled this meeting anyway? They’re based in San Francisco, but their calendar says they’re flying to Austin today… so is that PST or Central?

But at least you’re always talking about hourly intervals, right? Not really, what about India (GMT+5:30), or China (GMT+8 everywhere even though the country spans about 5 timezones). Oh, and what about daylight saving? Or the fact that not everyone can agree as to when the change should occur, if at all.

You get my point.

Can’t we all just get along?

In the computer world at least, there is some hope for standardisation. In the Internet connected world it is standard practice to set the computer’s system time (that is, the computer’s base clock) to GMT/UTC and then store local time as an offset from that.

This is because Unix (which historically much of the internet was built on) stores time as a timestamp defined as the number of seconds from 1/1/1970, and is always GMT. This is nice and unambiguous, and will work fine at least until 2038.

Having the base clock set to GMT/UTC by definition makes it at least possible, to convert between times, but this is far from being user friendly. As a user, this is not something I should be worrying about.

Option 1: Abolish timezones

Essentially do what the military has been doing for years and give all times in GMT/UTC/Zulu.

While this idea does have a certain appeal (to me at least, as I spend large portions of my life dealing with UNIX timestamps), given the level of international cooperation it would require, it would almost certainly never happen. Local time, with all its weird eccentricities is here to stay, least ways until some future despotic world government takes over.

Option 2: Make software do a better job (Hint: This option is the correct solution)

Computers need to do a better job of handling time on behalf of the user.

While some apps, for example Google Calendar, do an ok job, the interface always feels a little clunky. Google calendar in particular is really awkward when scheduling meetings with people in other timezones. Where exactly are my work colleagues? When would be a good time to schedule a meeting given that it should ideally fall somewhere in the work day for all of us?

When I see a date on a web page I should be able to hover over the date and have the browser (or whatever bit of software I’m using) tell me what time it is where I currently am, and my collaborators should be able to do the same. On the web, this should be straightforward with HTML5’s new semantic elements and the penchant for modern devices to come with some sort of geolocation hook.

When I am collaborating with others I should be able to see where they are, and what time it is there, whenever I need to do anything involving time. Crucially, all this should happen naturally and without the user having to worry about it.

If I were to say “Lets talk at 3:15pm”, I think it would be safe for the software to assume I meant 3.15 where I am at the moment, as this is generally how humans think. If my colleague in New York sees this the software should automatically tell them that by this I’m referring to 10:15am local time. Of course it gets a bit more complicated if I omit the AM/PM, but the software could reasonably assume that I’m more likely to schedule something for day time than night, although this assumption should probably trigger a warning of some sort to highlight the ambiguity.

The more detail I provide, the better the fix, so if I specify the timezone, then of course the software should use that (and all who view it can see this translated to their current local time).

It’s an interesting question how one should handle the fairly common situation where someone says something like “Lets meet at 3:15pm your time. The software should probably pick this up, but if we’re currently talking to more than one person we’d need to retain an idea of context in order to make sense of it.

…and then there comes the problem of translating such pronouns into other languages. Ouch.

The point is, this is the kind of stuff software engineers should care about but nobody else should have to. Let’s do this better.