Fork me on GitHub
» Making the world a better place, one byte at a time…

Encrypted client side PHP sessions

March 18th, 2013 by Marcus Povey

php Those of you who have done any amount of web application programming should all be familiar with the concept of a Session, but for everyone else, a Session is a way of preserving state between each browser page load in the inherently stateless web environment.

This is required for, among other things, implementing the concept of a logged in user!

In PHP, this is typically implemented by setting a session ID to identify a user’s browser and saving it as a client side cookie. This cookie is then presented to the server alongside every page request, and the server uses this ID to pull the current session information from the server side session store and populate the $_SESSION variable.

This session data must, obviously, be stored somewhere, and typically this is on the server, either in a regular file (default in PHP) or in a database. This is pretty secure, and works well in traditional single web server environments. However, a server side session store presents problems when operating in modern load balanced environments, like Amazon Web Services, where requests for a web page are handled by a pool of servers.

The reason is simple… in a load balanced environment, a request for the first page may be handled by one server, but the second page request may be handled by an entirely different server. Obviously, if the sessions are stored locally on one server, then that information will not be available to other servers in the pool.

There are a number of different solutions to this problem. One typical solution is to store sessions in a location which is available to all machines in the pool, either a shared file store or in a shared database. Another technique commonly used is to make the load balancer “session aware”, and configure it to address all requests from a given client to the same physical machine for the duration of the session, but this may limit how well the load balancer can perform.

What about storing sessions on the client?

Perhaps a better way to handle this problem would be to store session data on the client browser itself? That way it could send the session data along with any request, and it wouldn’t matter which server in the pool handled the request.

Obviously, this presents some issues, first of which being security. If session data is stored on the client, then it could conceivably be edited by the user. If the session contains a user ID, and this isn’t protected, then this could let a user pretend to be another. Any data stored on the client browser must therefore be protected, typically this is done by the use of strong encryption.

So, to begin with we need to replace the default PHP session handler with one of our own, which will handle the encryption and cookie management. The full code can be found on GitHub, via the link below, but the main bits of the code to pay attention to are saving the session:

// Encrypt session
$encrypted_data = base64_encode(mcrypt_encrypt(MCRYPT_BLOWFISH, self::$key, $session_data, MCRYPT_MODE_CBC, self::$iv));

// Save in cookie using cookie defaults
setcookie($session_id, $encrypted_data);

Followed by reading the session back in:

return mcrypt_decrypt(MCRYPT_BLOWFISH, self::$key, base64_decode($_COOKIE[$session_id]), MCRYPT_MODE_CBC, self::$iv );

If things are working correctly, you should see something similar to this:

Cookie Sessions

Limitations

There are some limitations to this technique that you should be aware of before trying to use it.

Firstly, the total size of all cookies varies from browser to browser, but is limited to around 4K. Remember, this is the total size of all cookies, not just the encrypted session, so your application should only store a bare minimum in the session.

Secondly, and this relates somewhat to the previous point, since the session is sent in the HTTP header on every request, you could eat up bandwidth if you start storing lots in the session. Another reason to store the absolute minimum!

Thirdly, session saving is done using a cookie, so you must be done with sessions, and call session_write_close(); explicitly, before you send any non-header output to the browser.

Have fun!

» Visit the project on Github…

Gun for hire!

February 20th, 2013 by Marcus Povey

It’s a brand new year!

Well, it’s been a new year for a little while actually, but 2013 has been a busy one so far. I’ve been working hard on some interesting things, but I still managed a sneaky skiing trip.

2012 was an awesome year; I welcomed it in rawkus style with my former housemates, and then a few days later, after the hangover had cleared of course, flew my first passengers as a newly qualified pilot!

I went to birthday parties, ate some great food, climbed, and enjoyed the company of some great people. I played Capoeira with my group at the Oxford Olympic Torch event, but otherwise managed to miss the worst of the Olympics by camping in the Czech wilderness followed by some epic climbing in Italy.

I have some big plans in motion for 2013, hopefully I’ll be able to dial up the awesome a few more notches! I want to finally get to grips with a foreign language, and ideally live abroad for a while in the native country. I want to progress my flying career in some way, advance to more complicated aircraft or perhaps do an aerobatic qualification. I intend to see more of the world, and climb more mountains (both figuratively and literally!).

Work wise, I’m working on a few exciting things (some of which will see the light of day really soon). As an FYI, I’m always interested to hear about your projects, especially if you need some technical and strategic muscle to help you!

Lets go!

Optimising your LAMP stack for performance

May 8th, 2012 by Marcus Povey

By default, the standard LAMP (Linux Apache Mysql Php/Perl/Python) stack doesn’t come particularly well optimised for handling more than a trivial amount of load. For most people this isn’t a problem, either they’re running on a large enough server or their traffic is at a level that they never hit against the limits.

Anyway, I’ve hit against these limits on a number of occasions now, and while there are many good articles out there on the subject, I thought I’d write down my notes. For my own sake as much as anything else…

Apache

Apache’s default configuration on most Linux distributions is not the most helpful, and you’re goal here is to do everything possible to avoid the server having to hit the swap and start thrashing.

  • MaxClients – The important one. If this is too high, apache will merrily spawn new servers to handle new requests, which is great until the server runs out of memory and dies. Rule of thumb:

    MaxClients = (Memory - other running stuff) / average size of apache process.

    If you’re serving dynamic PHP pages or pull a lot of data from databases etc the amount of memory a process takes up can quickly balloon to a very large value – sometimes as much as 15-20mb in size. Over time all running Apache processes will be the size of your largest script.

  • MaxRequestsPerChild – Setting this to a non-zero value will cause these large spawned processes to eventually die and free their memory. Generally this is a good thing, but set the value fairly high, say a few thousand.
  • KeepAliveTimeout – By default, apache keeps connections open for 15 seconds waiting for subsequent connections from the same client. This can cause processes to sit around, eating up memory and resources which could be used for incoming requests.
  • KeepAlive – If your average number of requests from different IP addresses is greater than the value of MaxClients (as it is in most typical thundering herd slashdottings), strongly consider turning this off.

Caching

  • SquidSquid Reverse Proxy sits on your server and caches requests, turning expensive dynamic pages into simple static ones, meaning that at periods of high load, requests never need to touch apache. Configuration seems complex at first, but all that is really required is to run apache on a different port (say 8080), run squid on port 80 and configure apache as a caching peer, e.g.


    http_port 80 accel defaultsite=www.mysite.com vhost
    cache_peer 127.0.0.1 parent 81 0 no-query originserver login=PASS name=myAccel

    One gotcha I found is that you have to name domains you’ll accept proxying for, otherwise you’ll get a bunch of Access Denied errors, meaning that in a vhost environment with multiple domains this can be a bit fiddly.

    A workaround is to specify an ACL with the toplevel domains specified, e.g.

    acl our_sites dstdomain .uk .com .net .org

    http_access allow our_sites
    cache_peer_access myAccel allow our_sites

  • PHP code cache – Opcode caching can boost performance by caching compiled PHP. There are a number out there, but I use xcache, purely because it was easily apt-gettable.

PHP

It goes without saying that you’d probably want to make your website code as optimal as possible, but don’t spend too much energy over this – there are lower hanging fruit, and as a rule of thumb memory and CPU is cheap when compared to developer resources.

That said, PHP is full of happy little gotchas, so…

  • Chunk output – If your script makes use of output buffering (which Elgg does, and a number of other frameworks do too), be sure that when you finally echo the buffer you do it in chunks.

    Turns out (and this bit us on the bum when building Elgg) there is a bug/feature/interaction between Apache and PHP (some internal buffer that gets burst or something) which can add multiple seconds onto a page delivery if you attempt to output large blocks of data all at once.

  • Avoid calling array_merge in a loop – When profiling Elgg some time ago I discovered that array_merge was (and I believe still is) horrifically expensive. The function does a lot of validation which in most cases isn’t necessary and calling it in a loop is ruinous. Consider using the “+” operator instead.
  • ProfileProfile your code using x-debug, find out where the bottlenecks are, you’d be surprised what is expensive and what isn’t (see the previous point).

Non-exclusive list, hope it helps!

All content is © Copyright Marcus Povey 2008-2013 and released under a Creative Commons licence unless otherwise stated.