    Multiple site support with MP’s Multisite Elgg

    April 19th, 2010 by Marcus Povey

    I have just Open Sourced an “itch scratching” project I’ve been hacking on for a little while. So, without much further ado, I’d like to introduce you to Marcus Povey’s Multisite Elgg!

    It is currently in Beta and the code could do with a bit of a tidy, but this is Open Source so roll up your sleeves and get involved.

    What is it?
    Multisite Elgg allows you to run multiple separate Elgg sites from the same install of the codebase, saving disk space and making administration a whole bunch easier.

    It is currently based on the latest Elgg 1.7 release. Once it is installed, adding a new Elgg site is a matter of clicking a button and entering some details.

    What can I do with it?
    You can do everything that you can do with Elgg, but with the ability to create new networks on demand. This will for example let you:

    • Set up your own version of Ning! What with Ning phasing out free accounts, it is my hope that Multisite Elgg will let a thousand more Nings bloom!
    • In your organisation or institution, easily set up Elgg sites for each department.
    • If you’re one of the Elgg hosting companies out there, you may want to look at multisite in order to simplify your workflow.
    • … etc…

    Installation
    Once you have downloaded the installation package you will need to do a few things in order to get up and running. Multisite Elgg assumes that you have some knowledge of how to set up and run a server – there is no wizard just yet!

    1. Unzip the package on your web server.
    2. Point your master domain at the contents of the install location on your web server. This is your master control domain; go here to configure your sites. Because of this, you might want to consider putting it behind some further access restrictions.
    3. Point any subdomains to the contents of the docroot folder, e.g. /var/multisite/docroot. This directory forms the base of all your Elgg installs. To make things even more automated you may want to consider making this an Apache wildcard domain, if your DNS provider supports it.
    4. Chmod 777 docroot/data: This is the default location for multisite domains.
    5. Install schema/multisite_mysql.sql: Create a new database on your MySQL server and install the Multisite schema – this is your master control database.
    6. Rename settings.example.php in docroot/elgg/engine/ to settings.php and configure:

      $CONFIG->multisite->dbuser = 'your username';
      $CONFIG->multisite->dbpass = 'password';
      $CONFIG->multisite->dbhost = 'host';

      Make sure this user has sufficient privileges to create and grant access to databases and tables on your server. This will allow the admin tool to create the databases for your hosted sites automatically.

    7. Visit your master domain and configure your admin user
    8. Begin configuring your sites!
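
    The command-line parts of the steps above (1, 4, 5 and 6) can be sketched roughly as follows. This is only an illustration: the zip file name, the install directory, the master database name and the use of the MySQL root account are all assumptions, and it assumes the archive unpacks docroot/ and schema/ directly into the install directory. The helper only prints the commands so you can review them before running anything by hand.

```shell
#!/bin/sh
# Sketch only: prints the commands for steps 1, 4, 5 and 6 so they can be
# reviewed before being run by hand. Nothing here is executed.
# Assumptions (not taken from the package): the zip file name, the install
# directory, the master database name and the MySQL root account.
print_install_commands() {
    package="$1"       # e.g. multisite.zip (hypothetical file name)
    install_dir="$2"   # e.g. /var/multisite
    master_db="$3"     # e.g. multisite_master (hypothetical name)
    engine_dir="$install_dir/docroot/elgg/engine"

    # Step 1: unpack the package
    echo "unzip $package -d $install_dir"
    # Step 4: make the data directory writable
    echo "chmod 777 $install_dir/docroot/data"
    # Step 5: create the master control database and load the schema
    echo "mysqladmin -u root -p create $master_db"
    echo "mysql -u root -p $master_db < $install_dir/schema/multisite_mysql.sql"
    # Step 6: put the settings file in place (edit it afterwards)
    echo "mv $engine_dir/settings.example.php $engine_dir/settings.php"
}

print_install_commands multisite.zip /var/multisite multisite_master
```

    Pointing the domains (steps 2 and 3) and editing settings.php still have to be done by hand.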

    Creating sites
    Once you have created an admin user, adding sites is easy. Currently you can only create one type of site, but in the future Multisite Elgg will let you create sites which have quotas and other access restrictions.

    You have a box to enter database details, or you can leave them blank to use the Multisite Elgg user defined above (which you may not want to do for security reasons).

    You can also select which of the installed plugins you want to allow. This lets different sites have different plugins available while still running them all from the same codebase.

    Contributing
    So, that was a brief introduction to Multisite Elgg. I hope that at least some of you out there find it useful!

    As I said before, it’s Open Source, so if you want to get involved here are the important details:

    If you want to contribute patches, feel free to use the bug tracker or discussion forum!

    Enjoy!

    Akismet plugin for Elgg

    January 18th, 2010 by Marcus Povey

    I have just written a very small Akismet plugin for Elgg.

    When enabled and configured, this plugin will scan newly submitted comments of the ‘generic_comment’ annotation class.

    While spam comments are rarer on Elgg due to the fact that most sites don’t allow anonymous comments, this could be useful for people who are getting spam comments from people who have signed up.

    This plugin comes into its own when you allow anonymous comments, such as on a site I recently built for a client.

    Extending this plugin to scan other content should be fairly straightforward for even a novice coder, but if I have time I’ll provide an interface to do so.

    Anyway, go get it here, or check out the project page on Google Code!

    Image “Spam! [don't buy]” by David Trattnig

    Running Elgg on a MySQL cluster

    September 21st, 2009 by Marcus Povey

    I have recently been looking at some aspects of the Elgg scalability question by exploring how easy it would be to get the latest version of Elgg (1.6) running on a MySQL cluster.

    In this article I will document the process, but first I should point out:

    • This is highly experimental and not endorsed in any way.
    • It is built against Elgg 1.6.1
    • This is not canonical and doesn’t reflect anything to do with the roadmap
    • This has not been extensively tested so caveat emptor.

    Setting up the cluster

    The first step is to set up the cluster on your equipment.

    A MySQL cluster consists of a management node and several data nodes connected together by a network. Because I was running rather low on hardware, I cheated here and created each node as a Virtual Box image on my laptop – but the principle is the same.

    Each node is an Ubuntu install (although you can use pretty much any OS) with two (virtual) network cards, one connected to the wider network (to install packages) and another on an internal network. If you do this for real you should consider removing the internet facing card once you’ve set everything up since a cluster isn’t secure enough to be run on the wider internet.

    In my test configuration I had three nodes with name/internal IP as follows:

    • HHCluster1/192.168.2.1 – Management node & web server
    • HHCluster2/192.168.2.2 – First data node
    • HHCluster3/192.168.2.3 – Second data node

    HHCluster1 – The management node

    Install mysql, apache etc. This should be a simple matter of apt-getting the relevant packages. Clustering (ndb) support is built into the version of mysql bundled with Ubuntu, but this may not be the case universally so check!

    You need to create a file in /etc/mysql/ called ndb_mgmd.cnf, which should contain the following:


    [NDBD DEFAULT]
    NoOfReplicas=2 # Number of copies of the data to keep (here, one per data node)
    DataMemory=80M # How much memory to allocate for data storage (change for larger clusters)
    IndexMemory=18M # How much memory to allocate for index storage (change for larger clusters)
    [MYSQLD DEFAULT]
    [NDB_MGMD DEFAULT]
    [TCP DEFAULT]

    [NDB_MGMD]
    HostName=192.168.2.1 # IP address of this system

    # Now we describe each node on the system

    [NDBD]
    # First data node
    HostName=192.168.2.2
    DataDir=/var/lib/mysql-cluster
    BackupDataDir=/var/lib/mysql-cluster/backup
    DataMemory=512M
    [NDBD]
    # Second data node
    HostName=192.168.2.3
    DataDir=/var/lib/mysql-cluster
    BackupDataDir=/var/lib/mysql-cluster/backup
    DataMemory=512M

    # One [MYSQLD] per data storage node
    [MYSQLD]
    [MYSQLD]

    Data nodes (HHCluster2 & 3)
    You must now configure your data nodes:

    1. Create the data directories, as root type:

      mkdir -p /var/lib/mysql-cluster/backup
      chown -R mysql:mysql /var/lib/mysql-cluster

    2. Edit your /etc/mysql/my.cnf and add the following to the [mysqld] section:

      ndbcluster
      # Replace the following with the IP address of your management server
      ndb-connectstring=192.168.2.1

    3. Again in /etc/mysql/my.cnf uncomment and edit the [MYSQL_CLUSTER] section so it contains the location of your management server:

      [MYSQL_CLUSTER]
      ndb-connectstring=192.168.2.1

    4. You need to create your database on each node (this is because clustering operates on a table level rather than a database level):

      CREATE DATABASE elggcluster;

    Starting the cluster

    1. Start the management node:

      /etc/init.d/mysql-ndb-mgm start

    2. Start your data nodes:

      /etc/init.d/mysql restart
      /etc/init.d/mysql-ndb restart

    Verifying the cluster
    You should now have the cluster up and running. You can verify this by logging into your management node and typing show in ndb_mgm.
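
    If you would rather script the check than read the output by eye, something like the following might do. It is a minimal sketch that assumes the management client marks missing nodes with the text "not connected"; the function name is made up for this example.

```shell
#!/bin/sh
# Counts nodes that the management server reports as "not connected".
# Reads the text of `ndb_mgm -e show` on standard input.
count_disconnected_nodes() {
    # grep -c prints the number of matching lines (0 when there are none);
    # "|| true" stops the function failing when that count is 0.
    grep -c 'not connected' || true
}

# Usage on the management node (needs a running cluster):
#   ndb_mgm -e show | count_disconnected_nodes
```

    A result of 0 means every node listed in ndb_mgmd.cnf has joined the cluster.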

    A word on access…

    The cluster is now set up and will replicate tables (created with the ndbcluster engine – more on that later), but that is only useful to a point. Right now we don’t have a single endpoint to direct queries to, so this direction needs to be done at the application level.

    We could take advantage of Elgg’s built-in split reads and writes, but this would only allow us to use a maximum of two nodes. A better solution would be to use a load balancer such as Ultramonkey to direct each query to the appropriate server, allowing us to scale much further.

    I didn’t really have time to get into this, so I am using the somewhat simpler mysql-proxy.

    1. On HHCluster1 install and run mysql-proxy:

      apt-get install mysql-proxy
      mysql-proxy --proxy-backend-addresses=192.168.2.2:3306 --proxy-backend-addresses=192.168.2.3:3306

    2. On your data nodes edit your /etc/mysql/my.cnf file. Find bind-address and change its IP to the node’s IP address. Also ensure that you have commented out any occurrence of skip-networking.
    3. Again on your data nodes, log in to mysql and grant access to your cluster database to a user on HHCluster1 – for example:

      GRANT ALL ON elggcluster.* TO `root`@`HHCluster1.local` IDENTIFIED BY '[some password]';

    Installing elgg

    Unfortunately as it stands, you need to make some code changes to the vanilla version of Elgg in order for it to work in a clustered environment. These changes are necessary because of the restrictions placed on us by the ndbcluster engine.

    Two things in particular cause us problems – ndbcluster doesn’t support FULLTEXT indexes, and it also doesn’t support indexes over TEXT or BLOB fields.

    FULLTEXT indexes are used for searching and are largely unused in the vanilla install of Elgg, so I removed them. Equally, one can live without most of the indexes on TEXT and BLOB fields, the exception being the one on the metastrings table.

    Metastrings is accessed a lot, so the index is critical. I therefore added an extra varchar field, and modified the code to store the first 50 characters of the indexed text in it – this is equivalent to the existing index:

    CREATE TABLE `prefix_metastrings` (
    `id` int(11) NOT NULL auto_increment,
    `string` TEXT NOT NULL,
    `string_index` varchar(50) NOT NULL,
    PRIMARY KEY (`id`),
    KEY `string_index` (`string_index`)
    ) ENGINE=ndbcluster DEFAULT CHARSET=utf8;

    And the modified query:

    $row = get_data_row("SELECT * from {$CONFIG->dbprefix}metastrings where string=$cs'$string' and string_index='$string_index' limit 1");

    MySQL’s optimiser checks the index first so this doesn’t lose a significant amount of efficiency (at least according to the explain command).
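
    To make the prefix rule concrete: the value stored in string_index is just the start of the full string, so the indexed column narrows the candidates and the comparison against string confirms the exact match. The patch does this in PHP; the shell helper below is only an illustration of the rule, and its name is made up for this example.

```shell
#!/bin/sh
# Returns the first 50 characters of a string: the value that would be
# stored in string_index alongside the full text.
# Note: printf's precision may count bytes rather than characters for
# multi-byte UTF-8 input, so treat this as an ASCII-only illustration.
string_index() {
    printf '%.50s' "$1"
}

# A short string is stored unchanged; a longer one is cut at 50 characters.
string_index "elgg"
echo
```

    Two strings that share their first 50 characters get the same string_index, which is why the query still compares the full string as well.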

    » Modified schema

    The next problem is that the system log currently uses INSERT DELAYED to insert the log data. This is also not supported under the clustered engine.

    There are a number of approaches we could take including using Elgg’s delayed write functionality or writing a plugin which replaces and logs to a different location.

    For the purposes of this test I decided to just comment out the code in system_log().

    What won’t work
    Currently there are a couple of core things that won’t work under these changes. Here is a by no means complete summary:

    • The system log (as previously described). This isn’t too much of a show stopper as the river code introduced in Elgg 1.5 no longer uses this.
    • The log rotate plugin as this attempts to copy the table into the archive engine type and we can’t guarantee which node it will be executed on in this scenario.
    • Any third party plugins which attempt to access the metastrings table directly (of which there should be none as direct table access is a big no no!)

    Anyway, here is a patch I made against the released version of 1.6.1 with all the code changes I made. Once you have applied this patch to your Elgg install you should be able to proceed with the normal Elgg install.

    Let me know any feedback you may have!

    » Elgg Clustering patch for Elgg 1.6.1

    Top image “Birds-eye view of the 10,240-processor SGI Altix supercomputer housed at the NASA Advanced Supercomputing facility.”

    Pastures new

    September 15th, 2009 by Marcus Povey

    I would like to interrupt your usual reading in order to make a brief announcement:

    I joined the Elgg team full time back in 2008 (after having worked for them as a technology consultant). I was there at the birth of the Elgg 1.0 codebase (which I helped design) and was delighted to see Elgg grow from strength to strength.

    By the time of the Elgg 1.5 release earlier in the year, Elgg had achieved widespread support and adoption and had cemented itself as the leading open source social networking framework. Now with the Elgg 1.6 release out of the door and the continued growth of the vibrant Elgg community, I have no doubt that Elgg’s success will continue into the future.

    However, I have decided that it is time to move on and explore other projects.

    So, as of the 27th October I will no longer be part of Curverider or Elgg’s core development team (although I dare say I will still end up doing a bit of Elgg coding here and there).

    I really enjoyed helping build Elgg and being part of the Elgg community, and I am excited to see where the community takes the platform next.

    As for me, I will be busily working on my next project (more on that to come) as well as continuing to provide expert consultancy services.

    Geocoding Elgg entities

    August 28th, 2009 by Marcus Povey

    Geotagging and GeoRSS support has been available in Elgg for a little while now, but like so many cool features of the platform, I haven’t really had the time to draw people’s attention to it.

    Although I am drawing your attention to it now, it should be noted that this is still somewhat under development!

    Anatomy of a geocoder

    To begin geocoding your data you will need a geocoder. This is not something that Elgg comes installed with by default, although here’s one I coded earlier.

    This geocoder uses the Google Maps API to do the actual encoding, and provides two primary features.

    • It handles the plugin hook “geocode”, “location”.
    • It listens to all create events and attempts to tag the new entity with a latitude and longitude – either from a ->location metadata on the object itself, or from the user’s current ->location – you could get more creative, this is a simple example.

    When you attempt to geocode a location you call the function elgg_geocode_location(). This in turn triggers the above plugin hook and attempts to encode the data.

    For efficiency, once a location has been geocoded the result is cached. Future attempts to geocode the same location will return the result from the cache.

    Once installed and configured, new content (wire posts etc) will be tagged with a latitude and longitude. Fill in your profile location field and try it out for yourself!

    Location based searches

    Once things are tagged with a location, you can start to use location as a starting point for searching by using some of the location aware search functions in location.php.

    This hasn’t currently been hooked up to the Elgg interface in any way, but that doesn’t stop you making use of it in your plugins.

    GeoRSS

    This is something you’ll be pleased to know that you get for free!

    When you list your entities with the Elgg listing tools using the RSS view, any entity that has a position defined will include it in standard GeoRSS Simple notation, e.g.:

    <georss:point> 45.256 -71.92 </georss:point>
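
    Note the order inside the element: GeoRSS Simple puts the latitude first, then the longitude, separated by a space. If you need to produce the same element outside Elgg, a minimal sketch might look like this (the function name is made up for this example):

```shell
#!/bin/sh
# Formats a latitude/longitude pair as a GeoRSS Simple point.
# Latitude comes first, then longitude, separated by a single space.
georss_point() {
    printf '<georss:point>%s %s</georss:point>\n' "$1" "$2"
}

georss_point 45.256 -71.92
```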

    Happy coding!

    All content is © Copyright Marcus Povey 2008-2010 and released under a Creative Commons licence unless otherwise stated.
