Category Archives: site valet

Virtualised!

My slice is up-and-running, and all major services appear to work, though there’ll doubtless be glitches.  Yesterday I updated DNS to point to it, after Richard contacted me to say they’re clearing out the old datacentre over the weekend.  This morning, DNS has propagated to my ISP and no doubt much of the ‘net, so next time you contact anything I run, it’ll be on the new slice.

There were a couple of minor panics in setting it up, when things didn’t compile first time.  libhtnorm (the backend for AccessValet) was an unexpected scare when it showed a bunch of unresolved C++ symbols.  But it turned out to be just the linker that was different, so it worked fine when I explicitly loaded libstdc++.so.  Other Site Valet tools required some very minor troubleshooting, but only at a sysop level (no programming).  My main fear proved unfounded as mod_validator required nothing more exciting than the latest Xerces package and an OpenSP build with the right options.  ApacheTutor also needed some trivial work, to compile mod_xmlns against the expat version installed on the new slice.

I’m still thinking about how best to add a note to pages served, inviting users to report anything that’s broken in the move.  Of course I can use mod_publisher to insert a notice; the stumbling block is working out a page design that accommodates the notice for each site affected.  All my sites need an overhaul anyway.

Anyway, it’s farewell to Openia, who have done a great job hosting the server over several years.  A special thanks to them for sponsoring it when WebThing was struggling with no money.

Link Valet timeout

Link Valet, the Site Valet tool for checking HTML links, has always been a somewhat dangerous tool to run.  Making lots of HTTP requests to external servers is inherently open to abuse, and particularly to a DoS attack on my server.  That’s one reason I haven’t released the script: I can react to problems on my server, but don’t want to take on the responsibility for what might happen elsewhere.

I’ve made a few adjustments to restrict it over the years: reducing the timeout for checking a link, reducing the level of recursion available, denying access to known abusive ‘bots.  But in the last few days, someone’s been abusing it to the point of overloading the server.  So I’ve put in an additional limit: a timeout on the whole thing.  That is, a graceful timeout: the script will complete, but will not check any more URLs after the timeout.
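For illustration only, here is a minimal Python sketch of that kind of graceful timeout. It is not the Link Valet script (which remains unreleased): the names `check_links`, `head_status`, `overall_timeout` and `per_link_timeout` are all hypothetical. The idea is simply that the loop consults an overall deadline before each link, and once the deadline has passed it records the remaining URLs as unchecked rather than aborting the run.

```python
import time
import urllib.error
import urllib.request
from typing import Callable, Iterable, List, Tuple


def head_status(url: str, per_link_timeout: float = 5.0) -> int:
    """Hypothetical per-link check: HEAD request with its own short timeout.

    Returns the HTTP status code, or 0 if the request could not be completed.
    """
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=per_link_timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status (404, 500, ...).
        return err.code
    except (urllib.error.URLError, OSError, ValueError):
        # DNS failure, refused connection, timeout, or malformed URL.
        return 0


def check_links(
    urls: Iterable[str],
    check_one: Callable[[str], int],
    overall_timeout: float,
    clock: Callable[[], float] = time.monotonic,
) -> Tuple[List[Tuple[str, int]], List[str]]:
    """Check links until the overall deadline passes, then finish gracefully.

    Links reached after the deadline are not requested at all; they are
    returned in the second list so the report can say they were skipped.
    """
    deadline = clock() + overall_timeout
    checked: List[Tuple[str, int]] = []
    skipped: List[str] = []
    for url in urls:
        if clock() >= deadline:
            skipped.append(url)  # past the deadline: no more requests
            continue
        checked.append((url, check_one(url)))
    return checked, skipped
```

A call such as `check_links(page_links, head_status, overall_timeout=30.0)` would then report on whatever it managed to check within thirty seconds and list the rest as unchecked. Note that a single slow request can still overrun the deadline by up to its own per-link timeout, which is why the two limits work together.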

So far, it seems to be working :-)

Things that come around

Several years ago, I proposed to Google that they should incorporate accessibility analysis into their search rankings. Their (eventual) reply was that they were not interested.

I’ve just heard the BBC’s In Touch programme, which deals with issues affecting blind and partially-sighted people. Today we had a lengthy interview with a blind Indian engineer working at Google on exactly that problem. He explained that the accessibility-enhanced search will, as its first priority, select the best/most relevant pages by Google’s standard closely-guarded-secret algorithms, but then order those results to ensure that the highest-placed results are accessible.

He even gave some technical details of how the accessibility assessment works. The perennial subject of alt attributes was mentioned (without details on how they assess them), but more interestingly, he referred to well-structured pages, and clearly uses HTML heading markup as a criterion.
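To make those two criteria concrete, here is a rough Python sketch of the sort of check involved. It is purely my own illustration, using only the standard library's `html.parser`; it is not how Google does it (no such details were given) and not Site Valet's analysis either. The names `StructureAudit` and `audit` are hypothetical. It counts images with no alt attribute at all, and counts places where the heading level jumps by more than one (for example an h1 followed directly by an h3).

```python
from html.parser import HTMLParser
from typing import Dict, List, Optional, Tuple


class StructureAudit(HTMLParser):
    """Collects two crude signals: images lacking alt, and heading levels."""

    def __init__(self) -> None:
        super().__init__()
        self.images_missing_alt = 0
        self.heading_levels: List[int] = []

    def handle_starttag(self, tag: str, attrs: List[Tuple[str, Optional[str]]]) -> None:
        if tag == "img":
            # Only a missing attribute is flagged; an empty alt="" is
            # legitimate for purely decorative images.
            if "alt" not in dict(attrs):
                self.images_missing_alt += 1
        elif len(tag) == 2 and tag[0] == "h" and tag[1] in "123456":
            self.heading_levels.append(int(tag[1]))


def audit(html: str) -> Dict[str, object]:
    """Return a small report on alt attributes and heading structure."""
    parser = StructureAudit()
    parser.feed(html)
    parser.close()
    levels = parser.heading_levels
    # A skipped level is a heading more than one level deeper than the
    # heading before it, e.g. h1 followed directly by h3.
    skipped = sum(1 for prev, cur in zip(levels, levels[1:]) if cur > prev + 1)
    return {
        "images_missing_alt": parser.images_missing_alt,
        "heading_levels": levels,
        "skipped_levels": skipped,
    }
```

A real assessment would of course need far more than this: whether alt text is meaningful, not merely present, is the hard part.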

It’s all happening very quietly, but it’s gratifying to those of us who have been banging on about this for years. Of course, it would’ve been far better if they’d used Site Valet (customised as necessary to integrate with their systems) for this analysis.