Ok, here’s a quick one.

I recently had the Ubuntu upgrade popup kick in on my desktop computer, prompting me to upgrade to Ubuntu 20.04. It was a rainy afternoon, so I went ahead.

Things seemed to go fine (even my dual screen Nvidia setup worked), but I could no longer log in from either the desktop or ssh. The local root account was fine, but none of the network users could log in.

So, this was an NIS / ypbind problem.

I logged in as root and ran ypwhich, which reported it could not connect to ypbind. However, networking was working, and I could ping the NIS server.

Running /etc/init.d/nis restart didn’t do anything, but when I ran ypbind manually, all of a sudden I was able to log in.
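If you want to repeat those checks without typing them one at a time, here is a minimal sketch. The function name nis_status is my own invention; it only wraps pgrep and ypwhich, and it assumes both are on the PATH.

```shell
#!/bin/sh
# Hypothetical helper: report whether ypbind is running and bound to a server.
nis_status() {
    # Is the ypbind daemon running at all?
    if ! pgrep -x ypbind >/dev/null 2>&1; then
        echo "ypbind is not running"
        return 1
    fi
    # ypwhich fails when ypbind has no server binding.
    if ! ypwhich >/dev/null 2>&1; then
        echo "ypbind is running but not bound"
        return 2
    fi
    echo "bound to $(ypwhich)"
}
```

Run nis_status as root after a reboot. In my case it would have reported that ypbind was not running, even though the NIS server answered pings.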

Ubuntu likes to change their startup scripts as often as politicians like to change their faces, so I wasn’t overly surprised. Most likely the startup order had changed, and maybe NIS was being brought up before networking was initialised.

My quick and dirty solution

Ok, so I’m getting old, and I don’t enjoy this as much as I once did. I’d rather not spend the entire day deep in the bowels of upstart or systemd or whatever new thing Ubuntu is using today.

So, this isn’t the correct solution, but it works.

  1. Log in as root
  2. Edit the crontab: crontab -e
  3. Add this line: @reboot /usr/sbin/ypbind &

Save and reboot, and you should be able to log in.
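If my guess about the startup order is right, a slightly less dirty variant is to wait until the NIS server answers before starting ypbind. This is only a sketch under that assumption: wait_for_host is a made-up helper, and nis.example.lan and /usr/local/sbin/start-ypbind.sh are placeholders for your own server and script path.

```shell
#!/bin/sh
# wait_for_host HOST [TRIES]: ping HOST roughly once a second until it answers.
# Returns 0 as soon as a ping succeeds, 1 if TRIES attempts all fail.
wait_for_host() {
    host=$1
    tries=${2:-30}
    while [ "$tries" -gt 0 ]; do
        if ping -c 1 -W 1 "$host" >/dev/null 2>&1; then
            return 0
        fi
        tries=$((tries - 1))
        sleep 1
    done
    return 1
}

# Last line of the script (placeholder hostname):
#   wait_for_host nis.example.lan && /usr/sbin/ypbind
```

Save it as, say, /usr/local/sbin/start-ypbind.sh with the last line uncommented, make it executable, and point the crontab entry at it: @reboot /usr/local/sbin/start-ypbind.sh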

Yes, I know this is dirty, but honestly life is too short!

tl;dr – Kernel headers for 5.3.0 have changed, so modules for older Nvidia cards no longer build. Downgrade to the last known good kernel (e.g. 5.0.0-37-generic) and you should be good.

So, I woke up this morning (blues riff), extra early in order to bash out some client work before heading to the gym. I turned on my computer and was greeted by the sight of my login screen, low res, and only on one monitor.

Nothing with computers, it seems, is going to be easy.

I remembered that I had done an apt-get upgrade the previous night, and this has in the past knocked the Nvidia drivers out of whack, so I reinstalled the drivers for my card (apt-get remove nvidia-driver-390; apt-get install nvidia-driver-390) and restarted.

No joy.

Fine, I’ll install the official drivers from the Nvidia binary. Never failed before.

Bang. Wouldn’t build.

Ok. Time to dig a little deeper.

I went back to the distro drivers, and this time removed them completely; apt-get remove nvidia-driver-390 --purge; apt-get autoremove; apt-get install nvidia-driver-390.

Still no joy.

However, picking apart the build logs gave us our first clue: a bunch of build errors to do with the Nvidia modules. It seems that the drivers were not able to build against the current kernel.

Looking at my /boot/ I can see that a new kernel (5.3.0) was installed as part of the upgrade. It looks like the kernel headers have been changed about, and this has broken older drivers.
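A quick way to see which kernels are installed, oldest version first, is sketched below. installed_kernels is a hypothetical helper that just lists the vmlinuz-* files in a directory (default /boot); it assumes GNU sort for the -V version ordering.

```shell
#!/bin/sh
# installed_kernels [DIR]: print the kernel versions found in DIR (default /boot),
# one per line, in version order.
installed_kernels() {
    dir=${1:-/boot}
    for f in "$dir"/vmlinuz-*; do
        # Skip the unexpanded pattern when no kernel images exist.
        [ -e "$f" ] || continue
        basename "$f" | sed 's/^vmlinuz-//'
    done | sort -V
}
```

Compare the output with uname -r to see which one you are actually running. If the drivers were installed through DKMS, dkms status should also show which kernels the nvidia module has been built for.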

So, there are two possible solutions – use a newer version of the Nvidia drivers (which isn’t possible for me since I have a pretty old GeForce card installed), or roll back to the previous kernel.

First, remove the 5.3.0 kernel: apt-get remove linux-image-5.3.0-26; apt-get remove linux-headers-5.3.0-26

You’ll get a warning if you’re currently running this kernel. Don’t worry, we’ll sort that out now.

Make sure you’ve got the working kernel installed: apt-get install linux-image-5.0.0-37 linux-headers-5.0.0-37

Now, make sure grub boots this kernel. Edit /etc/default/grub and change GRUB_DEFAULT to GRUB_DEFAULT="Advanced options for Ubuntu>Ubuntu, with Linux 5.0.0-37-generic" (or whatever the particulars of your last working kernel are – ls -larth /boot will show the history).
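The GRUB_DEFAULT value is fiddly to type, so here is a small sketch that prints the line for a given kernel version. grub_default_line is a made-up name, and the "Advanced options for Ubuntu" submenu title is simply what my install uses; yours may differ.

```shell
#!/bin/sh
# grub_default_line VERSION: print the GRUB_DEFAULT line for /etc/default/grub,
# assuming the stock Ubuntu submenu and menu entry titles.
grub_default_line() {
    printf 'GRUB_DEFAULT="Advanced options for Ubuntu>Ubuntu, with Linux %s"\n' "$1"
}
```

For example, grub_default_line 5.0.0-37-generic prints the line used above. To check the exact titles on your machine, grep menuentry /boot/grub/grub.cfg lists them.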

Update your boot configuration: update-grub

Reboot, and then log in to the console, and reinstall your Nvidia drivers: apt-get remove nvidia-driver-390 --purge; apt-get autoremove; apt-get install nvidia-driver-390

All being well, the Nvidia drivers should build this time. Reboot one last time, and you’re good!

Hope this helps.

UPDATE Jan 21

If you are running Focal or later, you may find that this stops working, and reinstalling the drivers as above doesn’t resolve the problem. This is because the 5.0.0-37 headers have been removed from the repo.

The solution I found was to install the headers (generic, and all) from the Bionic (18.04 LTS) repo.

Today was a very frustrating day.

So, yesterday, I did a rollup of software on my main work machine. I performed an apt-get upgrade as I have done a thousand times before. Logged off, and went to bed.

This morning, when I tried to log on, after I’d entered my details on the login screen, I was greeted by a blank screen for about 2 minutes, before being kicked back to the login screen.

Hmm..

This kind of thing had happened before, and in the past it was just a matter of installing the vendor Nvidia drivers for my card. Sometime back in the day the distro-provided Nvidia drivers had stopped working, so using the vendor ones was the way to go (this has since changed).

No joy. So, I began diving in and pulling at the various ends in an attempt to unravel this knotted ball of string.

Watching the logs, I noticed that just before the login process got thrown back to lightdm, I got a bunch of…

Activated service 'org.freedesktop.systemd1' failed: Process org.freedesktop.systemd1 exited with status 1

…appearing in my syslog.

So, something was suddenly up with systemd, but I was none the wiser.

My setup at home is that I have a server which has my home directory and users exported by NFS/NIS to various machines, so there was nothing actually on the work machine. Sod it, I thought, nuke the site from orbit. So, I reinstalled, just in case I had borked something up over the years.

The fresh install made me create a new user, fine. I installed all the graphics drivers, and was able to log in just fine. Great! So, installed the various bits of software, set up NIS/NFS, could log in on console… great! Logged in through gdm3… aaaand. Nothing. Same error. Switched to lightdm. Same thing.

But… the local user worked. Must be something in my user’s home dir, after all that. So, unmounted my home directory, and tried to log in as a fresh user… still no joy. But the local user could log in…

Hmmm….

Lightbulb!

As a hunch, I copied the user line from my server’s /etc/passwd into my local machine’s /etc/passwd… and bingo, I was able to log in.
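I did the copy by hand from the server, but since NIS lookups still worked on the console, getent on the client should return the same line. The sketch below is an assumption on my part, not what I actually ran: add_local_passwd_entry is a hypothetical helper that appends the getent result to a passwd file (default /etc/passwd) unless the user is already listed, keeping a .bak copy first.

```shell
#!/bin/sh
# add_local_passwd_entry USER [FILE]: append USER's passwd line, as resolved
# through NSS (and so NIS), to FILE unless USER is already listed there.
add_local_passwd_entry() {
    user=$1
    file=${2:-/etc/passwd}
    if grep -q "^${user}:" "$file"; then
        echo "$user already present in $file"
        return 0
    fi
    # Fail without touching FILE if the lookup returns nothing.
    entry=$(getent passwd "$user") || return 1
    # Keep a backup before modifying the file.
    cp -p "$file" "$file.bak" || return 1
    printf '%s\n' "$entry" >> "$file"
}
```

Run it as root with your network username, and try it against a scratch copy of /etc/passwd first if you are nervous. It is still the same crufty hack, just with less copy and paste.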

So, what looks to have happened is that a recent change (within the last week or so) has broken NIS user support for systemd/dbus. So, when the window manager was trying to start the services it needed to run, it wasn’t able to, since the user it was attempting to use couldn’t be found. Lightdm/PAM still functioned with NIS, so my thinking is that there’s something about the environment that’s looking directly at /etc/passwd for something, or to validate UIDs. I’m not an expert.

So, if any of you are in a similar situation, hopefully this blog post will stop you from losing an entire day of work!

My askubuntu ticket is over here, and I’ll keep updated should I find a better solution than this rather crufty hack.