Beware bfu and other gotchas!
More grief with OpenSolaris.
My SXCE seemed a decent, stable desktop platform. Until, that is, I installed Crossbow – experimental support for advanced networking, including a complete virtual network within a box.
Crossbow installation went according to the instructions. I think a message or two flashed past, but basically it was what TFM had led me to expect. On next boot I got quite a few more warnings, but I think that’s at least in part because I’d upped the verbosity and was seeing debug stuff that wouldn’t otherwise hit dmesg.
The network-inna-box worked. But various other things had gone. Strangely, punchin VPN still worked, but the install had removed my certificate – not hard to fix once I’d run it outside the GUI and got the error message. And I could live with trivialities like the loss of a system alert beep. But then I found some of my most-used tools broken: gdb just segfaulted, and svn failed with an incomprehensible error message (reinstalling svn from source gave a different message). Looks like incompatible core system libs somewhere, and Google finds nothing relevant. Ouch!
Chatting to Matt on IRC, it seems the culprit was the installer, bfu. bfu creates a binary install image tied to an exact system release. On a later release, such as the one I had been using, it installs but leaves the system in an indeterminate state. Bah!
Last weekend I backed up the home directory, and reinstalled. Not just reinstalled, but spent many hours downloading updated versions: the latest SFW tarball took the longest, at less than 20K/sec download rate – compared to over 200K for most downloads. And I reinstalled a lot of system software.
One upgrade was Sun Studio 12. The installer told me I had insufficient disc space(!) – seems the system install had put /opt on the root partition, and it wasn’t big enough. So I tried an alternative install path, and it installed. OK, fine.
Then my big mistake. After some build tools had failed to find it, I tried creating a symlink. Guess I must’ve used “ln -s /my/install/path/SUNWspro /opt/SUNWspro” but there was already an /opt/SUNWspro (dunno whence), so the symlink was – uselessly – at /opt/SUNWspro/SUNWspro. OK, need to move the directory before I can add the right symlink: something like
mv SUNWspro SUNWspro-bak
mv SUNWspro-bak/SUNWspro .
At this point, something very strange happened. Instead of moving the symlink, it started copying. It looked like one of those modern mv versions that use cp and rm when asked to move across partitions. Except – I’d only asked it to move a few bytes of symlink, and only within one partition, hadn’t I? df confirmed that something was going on, and I aborted the mv before it had filled the root partition.
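For anyone wanting to see the symlink mix-up without risking a real /opt, here is a minimal sketch in a scratch directory. The paths under $demo are stand-ins I made up for illustration, not the real install paths. It only reproduces the ln behaviour (a link named after an existing directory lands inside that directory); it makes no attempt to explain why the later mv started copying.

```shell
# Scratch area; every path below is a hypothetical stand-in, not a real system path.
demo=$(mktemp -d)

# Stand-ins for the alternative install location and the pre-existing /opt/SUNWspro.
mkdir -p "$demo/install/SUNWspro" "$demo/opt/SUNWspro"

# The link name already exists as a directory, so ln does not replace it:
# it creates the link inside that directory, named after the target's basename.
ln -s "$demo/install/SUNWspro" "$demo/opt/SUNWspro"

# The directory is still a plain directory...
ls -ld "$demo/opt/SUNWspro"

# ...and the new symlink sits one level down, at opt/SUNWspro/SUNWspro.
ls -l "$demo/opt/SUNWspro"
```

Running ls -ld on the intended link name before and after the ln is a cheap way to catch this before any build tool trips over it.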
Now I have a system where I can neither uninstall nor reinstall sunstudio. The uninstaller is a dangling symlink, and the installer complains of a partial installation. Well, actually it just reports failure, but trying it in another directory just gives me “Installation directory does not match directory of current partial installation”. Googling the exact error message gets me this page, which looks almost like it could be relevant. pkginfo -p shows nothing, and the prodreg tool shows me a damaged sunstudio installation, but throws a java exception when I try to uninstall (it’s the dangling symlink – turns out prodreg’s uninstall is just a wrapper for that). pkgrm also fails with a “no package associated with <SPROsslnk>” message.
Right. Time to cut my losses, reinstall again from that DVD, and override its default partition sizes this time. But at least I can skip the big downloads. Backed up the newly-downloaded stuff even as I blogged …
[update] I may have been on the point of giving up too soon. The prodreg tool has a lower-level option than using the package-supplied uninstaller (the broken symlink). That seemed to uninstall, whereupon reinstalling once again tells me it failed, but leaves me an installation that prodreg thinks is OK, and that seems (so far) to work …