What a difference a year makes… and I ain’t dead (yet). Long time, no post. Suffice it to say it has been a very busy year, with a lot going on, both professionally and personally, both good and bad. But, I won’t go into that here.

AI is the new hotness, as has been remarked. A lot of my day job now involves AI in some form or another, and indeed, the field of Structural Biology leads the way on a lot of the technological adoption. Notable examples include AlphaFold for AI-based structural predictions, but also some of the stuff I’m more directly involved with, which aims to build a model for improved fragment-based drug discovery. More on that stuff in due course.

Compared to that, this little thing I put together while bored during a meeting today may not be earth-shattering, but as a proof of concept I think it does start to address a problem we on the ARIA team have faced for a little while.

Namely, our catalogue works very well for people who already know what they’re after, but not for people who want to, for example, find something out but have no idea how to go about it.

Previously, we had used human experts, and even played with a live chat feature. None of this really scaled very well, but now we’ve got fancy pants AI, I figured I’d give it a try!

My thinking on this is fairly simple. The first step is to make sure all the services in ARIA have good descriptions, right down to what each machine can and can’t do. The better the description, the better the results from the model.

The second is to feed this context into the prompt, and instruct the AI (via the newfangled technique of prompt engineering) to act as an expert, setting the parameters accordingly. For example, I instruct it to only recommend things that appear in the context, and to provide, as a result, a list of suitable application URLs (which in reality will allow the user to directly apply for the service).

I also instructed the AI to consider how it could connect services together into a pipeline, as well as to consider alternative recommendations.
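
To give a feel for the shape of this, here is a minimal sketch of assembling such a prompt. It is not the code from the project: the Service fields, the wording of the instructions, and the example.org URLs are all made up for illustration, and it deliberately stops short of calling any particular model.

```python
from dataclasses import dataclass


@dataclass
class Service:
    """A catalogue entry. These fields are hypothetical, chosen for illustration."""
    name: str
    description: str
    apply_url: str


def build_prompt(services: list[Service], question: str) -> str:
    """Assemble one prompt: instructions, then catalogue context, then the question."""
    instructions = (
        "You are an expert advisor for a research infrastructure catalogue.\n"
        "Only recommend services listed in the CONTEXT below; if nothing fits, say so.\n"
        "Consider how services could be chained into a pipeline, and suggest alternatives.\n"
        "Finish with a list of application URLs for the services you recommend."
    )
    # One block per service, so a richer description gives the model more to work with.
    context = "\n\n".join(
        f"Service: {s.name}\nDescription: {s.description}\nApply: {s.apply_url}"
        for s in services
    )
    return f"{instructions}\n\nCONTEXT:\n{context}\n\nQUESTION:\n{question}"


if __name__ == "__main__":
    # Placeholder catalogue entries, not real ARIA services.
    catalogue = [
        Service("Cryo-EM", "Single particle imaging of purified protein samples.",
                "https://example.org/apply/cryo-em"),
        Service("Crystallisation screening", "High-throughput crystallisation trials.",
                "https://example.org/apply/screening"),
    ]
    print(build_prompt(catalogue, "I have a purified protein. How do I get a structure?"))
```

The resulting string would then be handed to whichever model is in use. Keeping the prompt assembly in a plain function like this makes it easy to tweak the instructions as the scientists start poking at it.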

Proof of concept right now, but already I’m getting possibly interesting results. I’ve got more tedious meetings coming up, as well as a fair amount of sitting in airports, so my plan is to wire this up with a chat interface and let real scientists play. No doubt we’ll have to tweak the prompt a number of times as we go, but as a first play with the technology I think this was a morning well spent!

» Visit the project on Gitlab...

Writing this here, since it caused some issues at the Day Job, and took a little bit of work to debug. Hopefully this will save some of you some time, and will jog my memory should something like this happen again.

Anyway, last week, one of my team wanted to deploy a new release of our Access Management System (ARIA), which involved deploying a bunch of containers. The release procedure worked fine. However, after the deploy, the main web app container was entirely unable to talk to our API layer.

My team did a little bit of debugging, but it was at this point that it got escalated to me. I rolled the live environment back, and began debugging the problem.

On the face of it, it seemed that networking, or at least name resolution, was no longer working from within the container. A curl call from the command line produced:

curl: (6) getaddrinfo() thread failed to start

However, a connection to an IP address would work. So, I began looking at networking / name resolution. The next step was to see what the name servers were doing… however, nslookup gave me:

isc_thread_create(): fatal error: pthread_create(): Operation not permitted

Interesting… so something was blocking the creation of new threads within the container. Likely the security model that Docker was running… not sure why this would change, but I confirmed this by redeploying with the seccomp and AppArmor confinement turned off:

security_opts:
      - "apparmor:unconfined"
      - "seccomp:unconfined"

Confirmed, networking was working.
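
If you want to check for the same symptom without relying on curl or nslookup, something along these lines would do it. This is a sketch, not what I actually ran: it assumes Python is available inside the container, and the function names are just ones I picked for illustration. Note that Python's own resolver call does not need to spawn a thread, so the two checks are independent of each other.

```python
import socket
import threading


def can_start_thread() -> bool:
    """Return True if this process is allowed to create a new thread."""
    try:
        t = threading.Thread(target=lambda: None)
        t.start()
        t.join()
        return True
    except RuntimeError:
        # CPython raises RuntimeError ("can't start new thread")
        # when the underlying thread creation fails.
        return False


def can_resolve(hostname: str) -> bool:
    """Return True if the hostname resolves to at least one address."""
    try:
        return len(socket.getaddrinfo(hostname, None)) > 0
    except OSError:
        # socket.gaierror is a subclass of OSError.
        return False


if __name__ == "__main__":
    print("thread creation:", "ok" if can_start_thread() else "BLOCKED")
    print("name resolution:", "ok" if can_resolve("localhost") else "FAILED")
```

If thread creation comes back blocked while plain IP connections still work, the container's security profile is a good place to look next.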

Not sure what’s changed, but it would appear that somewhere down the line the base Apache Linux image has updated, and is now using a different system call for starting threads. Likely a new version of GLIBC has been rolled into the container somewhere.

The final fix was to update the various containers to make sure they were all running on the newer base image, and then to redeploy Docker on our estate so that it was running the latest version.

Boom.

Everything back to normal, hope this saves you some time!

Molgenis is an open-source platform for scientific data management and research. The name “Molgenis” is derived from “Molecular Genetics Information Systems.” It provides tools for researchers to design, capture, and share data in the field of molecular genetics and other related areas.

Molgenis is designed to facilitate the handling of large-scale, complex datasets in genomics and other biomedical research domains. It offers features such as data integration, data modeling, and data management. Researchers can use Molgenis to create databases, design forms for data entry, and perform data analysis.

At The Day Job, we’re using it as part of our oncology research infrastructure project to act as a source of truth for certain system information as we build out a distributed access platform to help scientists and doctors conduct their research.

Anyway, at time of writing, there wasn’t a PHP client library for it, so I quickly put one together. Have fun!

» Visit the project on Gitlab...