This post, requested by Ben Werdmuller, pulls together a number of earlier posts in order to better document the federated, cross platform, friend/follow and signon mechanism stuff I’ve been hacking on recently. It’ll summarise the posts together with my latest thoughts, although I do encourage you to read the originals as well, since there’s a fair amount of detail there.

Federated/distributed social networking is something I (and many other people) have been kicking around for a little while. When working on Elgg, I was involved in a bunch of conversations where we explored getting the various Elgg sites to talk to each other, but it never really got anywhere at the time.

Times move on, and now I think we have a chance to really get somewhere; kicking about the Known code has given me a nice experimental platform to play with and there are now some distributed social tools and protocols that are seeing wide adoption (PuSH, MF2 etc), which is going to be very helpful.

Post Snowden of course, it is now clear that target dispersal, combined with widespread encryption, is required to keep our private lives safe from being spied on. Getting our everyday social interactions out of a centralised data-mining facility is now a basic requirement to safeguard our essential liberties.

Initial requirements

Going into this then, I wanted to start building the parts of a distributed social network, and I wanted to set some loose guidelines of what I’d like to see.

  • Distributed: There should be no central server anywhere in the ecosystem. Ideally transactions should occur peer to peer between nodes, rather than be orchestrated by a central body.
  • Cross platform: I don’t want to mandate the use of one specific platform. You can’t call yourself a distributed/federated social network if you can only federate between nodes running the same software! That’s a monoculture, and we know those are bad.
  • Simple, open, protocols: I don’t want to spend days building this, and if necessary I want to be able to test using the command line and CURL.
  • URLs, from a UX standpoint, are a bad way to identify people (lessons learnt from OpenID). I may need to reference user profiles by URL, but every time you force someone to type one in, God kills a kitten.

Friending and profile discovery

Original posts here and here.

The first step towards building a social network of any kind is to have the ability to add your friends to your network, and a distributed network is no different.

Here, and in my reference implementations on Known and Elgg, I am adopting the uni-directional “follow” idea of friendship (like Twitter follow) rather than the bi-directional transactional Facebook model, since this was the minimum I needed to make this work and, in my mind at least, better fits how “friending” works in the real world.

So then, friending works by having an endpoint on your site which is passed the URL you want to add as a friend. To make this easy, and to avoid typing URLs, both my reference Known and Elgg implementations contain a bookmarklet which you can add to your browser button bar.

Alice visits Bob’s website or profile and clicks on the button. Alice’s site then retrieves Bob’s site and parses it for whatever user details can be found on the page – looking for name, profile picture and the URL of their profile. This is made possible through the use of Microformats, especially MF2.

Microformats are simple bits of markup that are invisible to someone who just looks at a webpage, but which allow a computer to understand the meaning of things on a page: for example, that a certain bit of text is a person’s name, or that one link is a link to their profile picture and another link is their profile URL. Additionally, since this is just text on a page, there is no requirement for that page to be “special” in any way, i.e. it could just be a static page; there is no requirement for special headers or for the page to be the output of a script.

Here is an example of how a user may be marked up:
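To make this concrete, a hypothetical h-card for Bob might look like the `SAMPLE` string in the Python sketch below. The `example.com` URLs, the `HCardParser` class and the `parse_hcards` helper are all made up for illustration, and the parser is a deliberately simplified reader built on the standard library (it ignores nested h-cards and most of the MF2 parsing rules), so treat it as a sketch rather than a substitute for a real microformats library.

```python
from html.parser import HTMLParser

# Hypothetical profile markup for Bob; every URL here is made up.
SAMPLE = """
<div class="h-card">
  <img class="u-photo" src="https://example.com/bob/icon.jpg" alt="">
  <a class="p-name u-url" href="https://example.com/bob">Bob Example</a>
  <a class="u-email" href="mailto:bob@example.com">Email me</a>
</div>
"""

# Elements that never get a closing tag, so they must not affect nesting depth.
VOID_TAGS = {"img", "br", "hr", "input", "link", "meta"}


class HCardParser(HTMLParser):
    """Collects p-* (text) and u-* (URL) properties of each h-card."""

    def __init__(self):
        super().__init__()
        self.cards = []        # one dict of property lists per h-card
        self._depth = 0        # open elements inside the current h-card
        self._text_props = []  # p-* names waiting for text content

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        if self._depth == 0:
            # Outside any h-card: only an h-card root is interesting.
            if "h-card" in classes:
                self.cards.append({})
                self._depth = 1
            return
        if tag not in VOID_TAGS:
            self._depth += 1
        card = self.cards[-1]
        for cls in classes:
            if cls.startswith("u-"):
                value = attrs.get("href") or attrs.get("src")
                if value:
                    card.setdefault(cls[2:], []).append(value)
            elif cls.startswith("p-"):
                card.setdefault(cls[2:], []).append("")
                self._text_props.append(cls[2:])

    def handle_data(self, data):
        # Text only counts while a p-* element is open.
        for name in self._text_props:
            self.cards[-1][name][-1] += data

    def handle_endtag(self, tag):
        if self._depth and tag not in VOID_TAGS:
            self._depth -= 1
            self._text_props = []


def parse_hcards(html):
    """Return a list of {property: [values]} dicts, one per h-card."""
    parser = HCardParser()
    parser.feed(html)
    parser.close()
    return parser.cards
```

Calling `parse_hcards(SAMPLE)` gives one card holding Bob's `photo`, `name`, `url` and `email` values, each as a list because a property can appear more than once.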


This markup can then be easily processed using one of the many libraries out there; if you’re using PHP I highly recommend Barnaby Walters’ PHP-MF2 library. In the above example I create a block that I say identifies as a person (h-card), then detail their photo, full name, email address and a URL relating to them. This is probably enough information to be getting on with, but you can of course extract more profile/user information, if the markup is there.

Since a given page may contain multiple marked up people (especially if Alice clicks the “add friend” button while on a news feed), my reference implementations present a list of users which may be added, after first removing any duplicates (based on the URL of their profile), and you are also given the opportunity to fill in or amend any scraped information. If more than one URL is given for an entry, you need some mechanism to reconcile them – I just render this as a dropdown in order to give Alice the choice of Bob’s primary profile, but I’m sure there’s a cleverer way.

Once Alice is happy, she can add Bob as a friend, and her site can do any post friending stuff – subscribing to Bob’s PuSH endpoint (if one is specified), or generating access credentials for Bob.

So, in summary, distributed friending works like this:

  1. Alice sends Bob’s page URL to her magic friending endpoint (using a browser bookmarklet)
  2. Alice’s site examines the URL for MF2 marked up h-card entries
  3. Alice is presented with a unique list of h-card entries (where uniqueness is defined on normalised profile URLs).
  4. Alice adds Bob as a friend and triggers any post friend events
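Step 3 is the only part with any real logic in it, so here is a minimal Python sketch of the de-duplication. The normalisation rules (lower-casing scheme and host, dropping the fragment and any trailing slash) are my assumption about what "normalised" should mean, and the card layout (a dict with a `url` list) and the function names are hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit


def normalise_profile_url(url):
    """Lower-case scheme and host, drop the fragment and any trailing slash."""
    parts = urlsplit(url.strip())
    path = parts.path.rstrip("/")
    return urlunsplit(
        (parts.scheme.lower(), parts.netloc.lower(), path, parts.query, "")
    )


def unique_cards(cards):
    """Keep the first h-card seen for each normalised profile URL.

    Cards without any URL are kept as they are, since Alice can still
    fill the details in by hand.
    """
    seen = set()
    result = []
    for card in cards:
        urls = card.get("url") or []
        if urls:
            key = normalise_profile_url(urls[0])
            if key in seen:
                continue
            seen.add(key)
        result.append(card)
    return result
```

Only the first URL of each card is used as the key here; a card that lists several URLs would still need the "pick a primary profile" step described above.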

Listening to Bob

After Alice adds Bob as a friend, she wants to be told when Bob updates his site. In Known this is accomplished by performing a Pubsubhubbub discovery and subscribe when the “friend” event is triggered (step 4 above).

I won’t go into too much detail as to how a PuSH subscription handshake works, since there’s more complete implementation information in the spec, but in summary, when Alice successfully adds Bob as a friend, her site does the following:

  1. Alice’s site looks for a feed URL on Bob’s site.
  2. Her site retrieves that url and looks in it for a “self” link (which is the canonical permalink for the feed of Bob’s updates).
  3. Then her site looks at this URL again and looks for any declared PuSH hubs to which to subscribe.
  4. If found, her site places a marker that she is subscribing to this hub in memory, then makes a subscription request.
  5. Bob’s hub at some point later will ping Alice’s PuSH endpoint with a success or failure message.
  6. Alice’s PuSH endpoint matches this request with the list of requests she’s made, and if the security tokens match up she can say she is subscribed.
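As a rough illustration of the steps above, the Python sketch below does the discovery against an Atom feed, builds the subscription request body, and answers the hub's verification ping. It assumes the feed is Atom and that the hub speaks PuSH 0.3, which is the revision that carries `hub.verify` and `hub.verify_token` (later revisions dropped the token); the function names and the `pending` dictionary are hypothetical, and the actual HTTP calls are left out.

```python
import hmac
import secrets
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

ATOM = "{http://www.w3.org/2005/Atom}"


def discover(feed_xml):
    """Return (self_url, hub_url) from an Atom feed's top-level links."""
    root = ET.fromstring(feed_xml)
    found = {}
    for link in root.findall(ATOM + "link"):
        # Keep the first link seen for each rel value.
        found.setdefault(link.get("rel"), link.get("href"))
    return found.get("self"), found.get("hub")


def subscription_request(pending, topic, callback):
    """Remember a pending subscription; return the form body to POST to the hub."""
    token = secrets.token_hex(16)
    pending[topic] = token  # the "marker" from step 4
    return urlencode({
        "hub.mode": "subscribe",
        "hub.topic": topic,
        "hub.callback": callback,
        "hub.verify": "async",
        "hub.verify_token": token,
    })


def handle_verification(pending, params):
    """Answer the hub's ping (steps 5 and 6) as a (status, body) pair."""
    topic = params.get("hub.topic")
    expected = pending.get(topic)
    supplied = params.get("hub.verify_token", "")
    if expected is None or not hmac.compare_digest(
        expected.encode(), supplied.encode()
    ):
        return 404, ""  # not a request we made
    del pending[topic]  # no longer pending: we are subscribed
    return 200, params.get("hub.challenge", "")
```

Echoing `hub.challenge` back with a 2xx status is what tells the hub the subscription is wanted; anything else leaves Alice unsubscribed.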

Once subscribed, Alice’s endpoint will be pinged by Bob’s hub every time he makes an update. Alice’s site can then decide what to do with that information; perhaps Alice can use it to maintain a news feed, or send out an email update, whatever.

Friend only/private posts & friend signon

Original posts here and here, here and finally here.

So far, all we’ve really done is create a fancy RSS reader. The next step in creating a truly distributed social network is to have the ability to create posts which only your friends (or a selected subset of your friends) can see, but that the wider internet can not.

On centralised social networks this is trivial, since all users are local and can be identified in one of the many time-honoured and straightforward ways, and once identified, content that they’re not permitted access to can be easily hidden. On a distributed social network, this becomes much more difficult.

Fundamentally, it’s a problem of credential exchange.

There are many techniques you could deploy to solve this problem, and most of them are not mutually exclusive. One approach might be for Alice’s site to generate a random password and email it to Bob (since we likely have Bob’s email address from his h-card). Personally, I don’t think this is terribly clean.

So, I humbly put forward my thoughts on using OpenPGP keys as an identity mechanism.

OpenPGP signin

My spec for this can be found in these two posts, but in short it works as follows:

  • Bob generates / adds a pgp key pair to his profile, and publicises his public key in one or more of the following ways *(discussion: Bob’s site needs access to a private key in order to generate signatures, therefore this key material should be kept secure. It may be that it’s best to generate a new keypair for exclusive use by Bob’s site, but I do kind of like tying together Bob’s profile and Bob’s email and identifying both cryptographically with the same key)*
    1. Via a HTTP Link header, with a rel of “key”, e.g. Link: <https://example.com/bob/pubkey.asc>; rel="key"
    2. Via a LINK tag in the HTML head, e.g. <link href="https://example.com/bob/pubkey.asc" rel="key" />
    3. Via an anchor tag within the page body with rel="key", e.g. <a href="https://example.com/bob/pubkey.asc" rel="key">My Key</a>
    4. By pasting the key into the body of the page, and giving it a class of “key”, e.g.

      <pre class="key">
      -----BEGIN PGP PUBLIC KEY BLOCK-----
      ....
      -----END PGP PUBLIC KEY BLOCK-----
      </pre>

  • When Alice successfully adds Bob as a friend, her site attempts to extract the public key from his page. If found, her site saves the public key against Bob’s newly created user.
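For what it's worth, key discovery on Alice's side could be sketched in Python along these lines. It assumes the Link header uses the usual `<url>; rel="key"` syntax and that the tag in the page head carries `rel="key"`; the order in which the four sources are tried is my choice rather than part of the spec, and the names `key_url_from_link_header`, `KeyFinder` and `discover_key` are hypothetical.

```python
import re
from html.parser import HTMLParser


def key_url_from_link_header(value):
    """Find the rel="key" target in a Link header value, if any."""
    for target, params in re.findall(r"<([^>]*)>([^,]*)", value or ""):
        rel = re.search(r'rel\s*=\s*"?([^";]+)"?', params)
        if rel and "key" in rel.group(1).split():
            return target
    return None


class KeyFinder(HTMLParser):
    """Collects rel="key" links and <pre class="key"> blocks from a page."""

    def __init__(self):
        super().__init__()
        self.key_urls = []     # from <link rel="key"> and <a rel="key">
        self.inline_keys = []  # from <pre class="key">
        self._in_key_pre = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        rels = (attrs.get("rel") or "").split()
        classes = (attrs.get("class") or "").split()
        if tag in ("link", "a") and "key" in rels and attrs.get("href"):
            self.key_urls.append(attrs["href"])
        elif tag == "pre" and "key" in classes:
            self._in_key_pre = True
            self.inline_keys.append("")

    def handle_data(self, data):
        if self._in_key_pre:
            self.inline_keys[-1] += data

    def handle_endtag(self, tag):
        if tag == "pre":
            self._in_key_pre = False


def discover_key(link_header, html):
    """Return ("url", href), ("inline", armoured_key) or None."""
    url = key_url_from_link_header(link_header)
    if url:
        return ("url", url)
    finder = KeyFinder()
    finder.feed(html)
    finder.close()
    if finder.key_urls:
        return ("url", finder.key_urls[0])
    if finder.inline_keys:
        return ("inline", finder.inline_keys[0].strip())
    return None
```

A `("url", ...)` result still needs fetching, and whatever comes back should be imported into a keyring and stored against Bob's user by fingerprint.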

Now, some time later Alice creates a post, and she only wants Bob to be able to see it, so she…

  • Creates a new post, and adds Bob’s user to the ACL.
  • Bob’s site is notified by PuSH that Alice’s site has been updated, if Bob has also added Alice as a friend (because it’s a private post, we don’t push content, although conceivably we could encrypt the content with the public key of Bob, and whoever else has access. This bit is a little out of scope at the moment)
  • Bob visits Alice’s site and identifies himself by clicking on a bookmarklet. This bookmarklet passes the URL of Alice’s site back to Bob’s site which produces a signed request and sends it back.
  • Alice’s site verifies that the signature is valid, and that it was signed by the key belonging to Bob.

The signature sent by Bob’s site is formed over a message containing:

  1. The current date and time in ISO8601 format, as produced by date('c', time()); in PHP
  2. Bob’s profile URL
  3. The URL of the resource on Alice’s site that Bob is requesting.
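A minimal Python sketch of putting that message together follows. The timestamp matches what PHP's `date('c')` produces, but the layout (one field per line, in the order listed) is purely an assumption on my part, as is the `build_message` name; the result would then be signed with Bob's key using whatever OpenPGP tooling you have to hand.

```python
from datetime import datetime, timezone


def build_message(profile_url, resource_url, now=None):
    """Return the text to be signed: timestamp, profile URL, resource URL."""
    now = now or datetime.now(timezone.utc)
    # Same shape as PHP's date('c'), e.g. 2014-07-01T12:00:00+00:00
    stamp = now.replace(microsecond=0).isoformat()
    # Assumed layout: one field per line.
    return "\n".join([stamp, profile_url, resource_url])
```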

Alice’s site should, on receiving this:

  1. Verify the signature is valid and the contents unmodified.
  2. Verify that it was signed by Bob’s key.
  3. Verify that the resource being requested is on Alice’s site.
  4. Check the timestamp is valid and within an acceptable range from now.
  5. Combine the timestamp + profile url + resource url into a nonce, and check that we’ve not seen this request before by querying that nonce against the nonce store.
  6. Store the nonce, so that any later replay of the same request is rejected.

If all the above passes, Alice’s site lets Bob access the restricted resource (and optionally, logs Bob in to the site, allowing him access to any other resources he has access to).
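Here is a hedged Python sketch of the checks on Alice's side, leaving the OpenPGP verification itself to whichever library you use. It assumes the signature has already been verified and the three fields parsed out of the message; the five minute tolerance, the `known_keys` mapping of profile URL to fingerprint and the function name are all assumptions. Note that it looks the nonce up before storing it.

```python
from datetime import datetime, timedelta, timezone
from urllib.parse import urlsplit

MAX_SKEW = timedelta(minutes=5)  # assumed tolerance; pick what suits you


def check_signed_request(timestamp, profile_url, resource_url,
                         signer_fingerprint, known_keys, site_host,
                         seen_nonces, now=None):
    """Checks 2 to 6 for a request whose signature already verified."""
    now = now or datetime.now(timezone.utc)

    # 2. The signing key must be the one stored for this profile.
    expected = known_keys.get(profile_url)
    if expected is None or expected != signer_fingerprint:
        return False

    # 3. The requested resource must live on Alice's site.
    if urlsplit(resource_url).netloc.lower() != site_host.lower():
        return False

    # 4. The timestamp must parse, carry an offset and be recent.
    try:
        sent = datetime.fromisoformat(timestamp)
    except ValueError:
        return False
    if sent.tzinfo is None or abs(now - sent) > MAX_SKEW:
        return False

    # 5 and 6. Reject replays, then remember this request.
    nonce = (timestamp, profile_url, resource_url)
    if nonce in seen_nonces:
        return False
    seen_nonces.add(nonce)
    return True
```

In a real deployment the nonce store would live in a database and could be pruned of anything older than the tolerance window, since those requests fail the timestamp check anyway.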

Moving forward

So far I’ve demonstrated this working in a small distributed social network comprised of Known users, Elgg users and WordPress users, as well as PGP signon from the same plus shell scripts and javascript.

Nothing here requires anything particularly special to get up and running, and I’m hopeful that after all this has been revved a few times it’ll be pretty robust.

I’d be interested in your thoughts!

Distributed social networks – tools that give you all the social and political benefits of the siloed networks (Google+, Facebook, etc), but without being a massive honey pot for surveillance and data mining, are, in my view, the way we should be heading.

In this model, public posts are easy (that’s just the web), but limiting posts so that they can only be seen by a limited number of your friends is somewhat harder. On Elgg, and similar systems, the standard solution was to make everyone create an account, and profile, on your node. This is, to a large extent, the traditional approach, but basically ends up with you having multiple profiles around the internet (with multiple passwords to remember) which are, crucially, controlled by a third party.

This is a bad thing, and in the post Snowden world, a downright dangerous thing.

I’ve previously discussed a possible approach to providing distributed signon using OpenPGP keys as an identity mechanism, and I’ve finally got around to fleshing this out, and building a prototype, now that distributed friending is in Idno/Known core.

Protocol overview

  • Two user profiles, Alice and Bob
  • Alice and Bob generate, or otherwise associate, a PGP key pair with their users (for the most part, only public keys are used in this. You only need to store the private key on the server if you’re automating the process of signing in, and if you can store your private key in your browser, there is eventually no need to store private keys on the server).
  • Alice adds Bob as a friend, and Alice’s site visits Bob’s profile for his public key (see “Public key discovery” below)
  • Rinse, repeat, for Clare, Dave, Emma, Fred, etc…
  • Alice writes a post, and only wants Bob to see it. She lists Bob’s profile URL as an approved viewer.
  • Bob visits the private post, and identifies himself by signing his profile URL with his key, and then POSTing the ASCII armoured signature, as a field named signature, to the post URL.
  • Alice verifies the signature, and confirms that the key’s fingerprint belongs to Bob’s key, and if so, lets Bob see the post.

Public key discovery

Bob makes his public key available by putting it on his web server, and making it easily discoverable to Alice in one or more of the following ways:

  1. Via a HTTP Link header, with a rel of “key”, e.g. Link: <https://example.com/bob/pubkey.asc>; rel="key"
  2. Via a LINK tag in the HTML head, e.g. <link href="https://example.com/bob/pubkey.asc" rel="key" />
  3. Via an anchor tag within the page body with rel="key", e.g. <a href="https://example.com/bob/pubkey.asc" rel="key">My Key</a>
  4. By pasting the key into the body of the page, and giving it a class of “key”, e.g.

<pre class="key">
-----BEGIN PGP PUBLIC KEY BLOCK-----
....
-----END PGP PUBLIC KEY BLOCK-----
</pre>

Identifying Bob

When Bob wants to see the post that Alice has made, he identifies himself by making a POST request to that page, containing a signed URL of his profile. Alice then verifies the profile URL against those she has allowed access, and verifies that the signature is both correct and that the fingerprint belongs to Bob’s key.

Alice may want to store these access details in a session so she can give Bob access to other resources (logging Bob in, in effect), but this is not strictly necessary.
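The decision Alice's site makes at this point is small enough to sketch in a few lines of Python. This assumes the signature itself has already been checked by an OpenPGP library which reported the signer's fingerprint; the `authorise` name, the ACL as a set of profile URLs and the use of a 403 for refusals are illustrative choices.

```python
def _clean(fingerprint):
    """Fingerprints are often shown with spaces and mixed case."""
    return fingerprint.replace(" ", "").upper()


def authorise(post_acl, known_keys, claimed_profile_url, verified_fingerprint):
    """Return 200 if Bob may see the post, 403 otherwise."""
    if claimed_profile_url not in post_acl:
        return 403  # Alice never listed this profile for the post
    expected = known_keys.get(claimed_profile_url)
    if expected is None or not verified_fingerprint:
        return 403  # no key on file, or nothing verified
    if _clean(expected) != _clean(verified_fingerprint):
        return 403  # signed, but not by the key saved for Bob
    return 200
```

On a 200 the site could also stash the profile URL in the session, which is all "logging Bob in" amounts to here.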

Other methods are available…

So, why not use OAuth, or signed HTTP requests?

Well first of all, all these authentication methods are not mutually exclusive, so there’s no reason why you can’t use multiple techniques.

Second, we’re using very standard tools (GPG, POST requests, etc), and standard formats, bolted together. This means, among other things, that although this example (and the Idno implementation) uses a website to do the signing in, this isn’t really required. You can sign in and see a private post, just as easily, using curl and gpg from the command line, if you so require.

Finally, this is entirely distributed, and unlike some implementations of OAuth, or even things like IndieAuth, it requires no central authority to vouch for you. Update: Aaron points out that the latest versions of IndieAuth don’t require a central authority.

Idno reference implementation

I have written a plugin that implements this protocol for Idno. In addition to the basic spec, the Idno plugin has the following enhancements, which you may want to consider as well.

Firstly, it uses OpenPGP.js to generate the keypair on the client machine. This preserves server entropy, making it better for hosted environments.

Secondly, the plugin provides you with a bookmarklet, which makes signing in to a compatible site nothing more than a button click.

Please kick both the Idno implementation and the overall spec about, and let me know what you think!

» Visit the project on Github...

The other day I sketched out some notes on how friend/follow and subscribe might work in a distributed social network such as Idno (I have since hacked together some plugins based on those notes).

So, I thought I’d sketch up some thoughts on how private and friend only posts might work in a distributed social network:

Outline specification

  • On account creation (or if there isn’t a key present) a key pair is generated, this key pair is used to identify a user’s profile to that user’s friends.
    • I’m not sure exactly what kind of key this should be at this point. I’m leaning towards a PGP key pair, although OpenSSL has its merits (of course, there is no reason why we can’t use multiple technologies).
    • We’ll probably have to have the private key stored on the server for the purposes of signing, although there’s no reason why these have to be your main keys.
  • When Alice follows Bob, as described in my previous post, they both pull the public keys from each other’s profile as part of that exchange, which have been marked up using Microformats 2. Any keys found are saved against the record of that user.
    • For security, we probably want to do some sort of key validation here; perhaps key fingerprint, or perhaps better some web of trust based on mutual friends…
    • How key revocation might work is an open question, but I think the easiest way might be for Alice to send another subscription request to Bob, and have that re-trigger this process, rather than (as happens at the moment), returning an error that Alice is already subscribed to Bob.
  • When Bob writes a friends only post he lists Alice’s profile UUID in a list of people who can view the post, then pings Alice’s endpoint.
  • When Alice visits Bob, or Alice’s site visits Bob’s permalink, it identifies itself by signing the request using her key. If the signature is valid and belongs to a key for a user for which Bob has allowed access to the permalink, the data is displayed, otherwise a HTTP 403 code is returned.

Just some rough thoughts for now, let me know your thoughts!