Sitemaps are specially crafted XML files, usually located at https://yourdomain.com/sitemap.xml, that help search engines better crawl your site.
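A minimal sitemap looks something like this (the URL and date below are placeholder values, not anything the plugin actually emits):

    <?xml version="1.0" encoding="UTF-8"?>
    <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
      <url>
        <loc>https://yourdomain.com/2015/an-example-post</loc>
        <lastmod>2015-01-10</lastmod>
      </url>
    </urlset>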

It came up in conversation on IRC that there was a need for a sitemap plugin for Known, and because such a plugin would be useful to me as well as others (and because I had a little bit of time while waiting for a painfully slow set of Vagrant builds), I thought I’d put something together.

So, over on GitHub, I’ve put together a quick plugin that will automatically generate a basic sitemap for your site, as well as update your robots.txt accordingly.

When you first visit your sitemap.xml file, a sitemap will be generated and cached. When you create new posts, this file will be automatically updated.
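The plugin itself lives in Known’s PHP codebase, but the general idea is roughly the following. This is a hypothetical Python sketch, not the plugin’s actual code, and the post data and domain are placeholders:

    import os
    from datetime import date
    from xml.sax.saxutils import escape

    # Hypothetical example data; in the real plugin this would come
    # from Known's own content store.
    posts = [
        {"url": "https://yourdomain.com/2015/example-post", "updated": date(2015, 1, 10)},
        {"url": "https://yourdomain.com/2015/another-post", "updated": date(2015, 1, 12)},
    ]

    def build_sitemap(entries):
        """Return a basic sitemap.xml document as a string."""
        lines = ['<?xml version="1.0" encoding="UTF-8"?>',
                 '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">']
        for entry in entries:
            lines.append("  <url>")
            lines.append("    <loc>%s</loc>" % escape(entry["url"]))
            lines.append("    <lastmod>%s</lastmod>" % entry["updated"].isoformat())
            lines.append("  </url>")
        lines.append("</urlset>")
        return "\n".join(lines)

    def write_sitemap_and_robots(entries, webroot="."):
        """Cache the sitemap to disk and advertise it in robots.txt."""
        with open(os.path.join(webroot, "sitemap.xml"), "w") as f:
            f.write(build_sitemap(entries))

        # Add a Sitemap: line to robots.txt if it isn't already there.
        robots_path = os.path.join(webroot, "robots.txt")
        sitemap_line = "Sitemap: https://yourdomain.com/sitemap.xml"
        existing = ""
        if os.path.exists(robots_path):
            with open(robots_path) as f:
                existing = f.read()
        if sitemap_line not in existing:
            with open(robots_path, "a") as f:
                f.write("\n" + sitemap_line + "\n")

    write_sitemap_and_robots(posts)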

It’s pretty simple at the moment, but as usual, pull requests are welcome!

» Visit the project on GitHub...

The Domain Name System – which much of the internet is built on – is a system of servers that turns friendly names humans understand (foo.com) into IP addresses that computers understand (192.0.2.44, say).
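That translation is easy to see from code. A minimal Python sketch, using only the standard library (the hostname is just an example):

    import socket

    # Ask the operating system's resolver, which in turn walks the DNS,
    # to turn a human-friendly name into an IP address.
    hostname = "example.com"
    print(hostname, "->", socket.gethostbyname(hostname))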

It is hierarchical and, to a large extent, centralised. You will be the master of *.foo.com, but you have to buy foo.com from a .com registrar.
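To make that hierarchy concrete, here’s a small sketch (assuming the third-party dnspython package; foo.com is just the example name from above) that asks which name servers are authoritative at each level:

    import dns.resolver  # third-party: pip install dnspython (2.x)

    # The root zone delegates .com to the .com registry's servers, and the
    # .com zone in turn delegates foo.com to whoever registered it.
    for zone in [".", "com.", "foo.com."]:
        answer = dns.resolver.resolve(zone, "NS")
        print(zone, "is served by:", ", ".join(str(ns) for ns in answer))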

These top level domain registrars, if not owned by national governments, are at least strongly influenced and increasingly regulated by them.

This, of course, makes these registrars a tempting target for oppressive governments like China, the UK and the USA, and for insane laws like SOPA and the Digital Economy Act, which seek to control information and shut down sites that say things the government doesn’t like.

Replacing this system with a less centralised model is therefore a high priority for anyone wanting to ensure the protection of the free internet.

Turning text into numbers isn’t the real problem

This may not be an entirely new observation, but turning a bit of text into a set of numbers is, from a user’s perspective, not what they’re after. They want to view Facebook, or a photo album on Flickr.

So finding relevant information is the problem we’re really trying to solve, and the entire DNS system is really just an artefact of search not being good enough when the system was designed.

Consider…

  • Virtually all modern browsers have query bars that autocomplete and search as you type.
  • Browsers like Chrome only have a single combined search and address bar.
  • My mum types domain names, partial domain names, or something like the domain name (depending on recollection) into Google.

In most cases, using the web has become synonymous with search.

Baked-in search

So, what if search was baked in? Could this be done, and what would the web look like if it was?

What you’re really asking when you visit Facebook, Amazon or any other site is “find me this thing called xxxx on the web”.

Similarly, when a browser tries to load an image, what it’s really saying is “load me this resource called yyyy, which is hosted on web server xxxx, on the web” – which is really a specialisation of the previous query.
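A URL already encodes both halves of that question, as this small Python sketch shows (the URL is a made-up example reusing the placeholders above):

    from urllib.parse import urlsplit

    parts = urlsplit("https://xxxx.example/photos/yyyy.jpg")

    print("which server:", parts.netloc)   # xxxx.example, the "find me this host" part
    print("which resource:", parts.path)   # /photos/yyyy.jpg, the "load me this thing" part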

You’d need searches to be done in some sort of peer-to-peer way, and distributed using an open protocol, since you wouldn’t want to have to search the entire web every time you looked for something. Neither would you want to maintain a local copy of the Entire World.
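As a very rough illustration of what that might look like, here’s a toy Python sketch of a distributed-hash-table-style lookup, where each peer is responsible for only a slice of the name space rather than a copy of everything (the peer names and queries are entirely made up):

    import hashlib
    from bisect import bisect

    # A toy consistent-hash ring: each peer owns the names whose hashes
    # fall between its predecessor's position and its own.
    peers = ["peer-a.example", "peer-b.example", "peer-c.example"]

    def position(value):
        """Map a string onto a point on the ring."""
        return int(hashlib.sha1(value.encode()).hexdigest(), 16)

    ring = sorted((position(p), p) for p in peers)

    def responsible_peer(name):
        """Find the peer you would ask about `name`; no global index needed."""
        points = [point for point, _ in ring]
        index = bisect(points, position(name)) % len(ring)
        return ring[index][1]

    for name in ["facebook", "my holiday photos", "foo.com"]:
        print(name, "->", responsible_peer(name))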

It’d probably eat a lot of bandwidth, and until computers and networks get fast enough, you’d probably still have to rely on large search entities (Google etc.) to do most of the donkey work, so this may not be something we can really do right now.

But consider: most of us now have computers in our pockets with more processing power than existed on the entire planet a few decades ago; at the beginning of the last century, the speed of a communication network was limited by how fast a manual operator could open and close a circuit relay.

What will future networks be capable of (and personally, I don’t think we’re that far off)? Discuss.