Stream Editor for Trafficserver

I haven’t blogged much on software of late. Well, I don’t seem to have blogged much at all, but my techie content has been woefully sparse even within a meagre whole.

Well, I’ve just added a new stream editor to Apache Trafficserver.  It’s been on my to-do list for a long time to produce functionality similar to sed and the sed-like modules in Apache HTTPD.  Now I’ve hacked it up, and dropped it into the main repo at /plugins/experimental/stream-editor/.  I expect it’ll stay in /experimental/ until and unless it gets sufficient real-world usage to prove itself and sufficient demand to be promoted.

The starting point for this was to duplicate the functionality of mod_line_edit or mod_substitute, but with the capability (offered by mod_sed but not by the others) to rewrite incoming as well as outgoing data.  Trafficserver gives me that for free, as the same code will filter both input and output.  Some of the more advanced features, such as HTTPD’s environment variables, are not supported.

There were two main problems to deal with.  Firstly, the configuration needs to be designed and implemented from scratch: that’s currently documented in the source code. It’s a bit idiosyncratic (I’ll append it below): suggestions welcome.  Secondly, the trafficserver API lacks a set of utility classes as provided by APR for Apache HTTPD.  To deal with the latter, I hacked it in C++ and used STL containers, in a manner that should hopefully annoy purists in either C (if they exist) or C++ (where they certainly do).

In figuring it out I was able to make some further improvements.  In particular, it deals much better than mod_line_edit or mod_substitute with the case where different rules produce conflicting edits: rules can be assigned different precedences in configuration to resolve conflicts.  And it applies all rules in a single pass, avoiding the overhead of reconstituting the data or parsing ever-more-fragmented buffers – though it does have to splice buffers to avoid the risk of losing matches that span input chunks.  It parses each chunk of data into an ordered (STL) set of edits before actually applying them and dispatching the edited data.
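To make the conflict-resolution idea concrete, here is a minimal sketch of priority-based edits collected into an ordered set and applied in one pass.  It is not the plugin’s code: the names `Rule`, `Edit` and `apply_rules` are hypothetical, it handles plain-string rules only, and it sorts rules by priority up front rather than reshuffling edits as the real thing may do.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <set>
#include <string>
#include <vector>

// Hypothetical rule: replace every occurrence of `from` with `to`.
struct Rule {
  std::string from;
  std::string to;
  int prio;  // single digit; a lower value prevails
};

// One accepted edit against the ORIGINAL data.
struct Edit {
  std::size_t start;
  std::size_t len;
  std::string repl;
};

// Order edits by where they start in the data.
struct ByStart {
  bool operator()(const Edit &a, const Edit &b) const { return a.start < b.start; }
};

static bool overlaps(const Edit &a, const Edit &b) {
  return a.start < b.start + b.len && b.start < a.start + a.len;
}

std::string apply_rules(const std::string &in, std::vector<Rule> rules) {
  // Highest precedence (lowest prio value) first, so that an edit already
  // in the set always beats a later candidate that overlaps it.
  std::stable_sort(rules.begin(), rules.end(),
                   [](const Rule &a, const Rule &b) { return a.prio < b.prio; });

  std::set<Edit, ByStart> edits;
  for (const Rule &r : rules) {
    assert(!r.from.empty() && r.prio >= 0 && r.prio <= 9);
    for (std::size_t pos = in.find(r.from); pos != std::string::npos;
         pos = in.find(r.from, pos + r.from.size())) {
      Edit e{pos, r.from.size(), r.to};
      // Accepted edits never overlap each other, so only the two
      // neighbours of the candidate can possibly clash with it.
      auto next = edits.lower_bound(e);
      bool clash = next != edits.end() && overlaps(e, *next);
      if (!clash && next != edits.begin()) {
        clash = overlaps(e, *std::prev(next));
      }
      if (!clash) {
        edits.insert(next, e);
      }
    }
  }

  // Single pass: copy untouched data and splice in the replacements.
  std::string out;
  std::size_t cur = 0;
  for (const Edit &e : edits) {
    out.append(in, cur, e.start - cur);
    out += e.repl;
    cur = e.start + e.len;
  }
  out.append(in, cur, std::string::npos);
  return out;
}
```

Because every edit is computed against the original data, the output of one rule is never re-scanned by another: escaping `&` does not mangle the `&lt;` produced by escaping `<`.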

/* stream-editor: apply string and/or regexp search-and-replace to
 * HTTP request and response bodies.
 * Load from plugin.config, with one or more filenames as args.
 * These are config files, and all config files are equal.
 * Each line in a config file and conforming to config syntax specifies a
 * rule for rewriting input or output.
 * A line starting with [out] is an output rule.
 * One starting with [in] is an input rule.
 * Any other line is ignored, so blank lines and comments are fine.
 * Each line must have a from: field and a to: field specifying what it
 * rewrites from and to. Other fields are optional. The full list:
 * from:flags:value
 * to:value
 * scope:flags:value
 * prio:value
 * len:value
 * Fields are separated by whitespace. from: and to: fields may contain
 * whitespace if they are quoted. Quoting may use any non-alphanumeric
 * matched-pair delimiter, though the delimiter may not then appear
 * (even escaped) within the value string.
 * Flags are:
 * i - case-independent matching
 * r - regexp match
 * u (applies only to scope) - apply scope match to full URI
 * starting with "http://" (the default is to match the path
 * only, as in for example a <Location> in HTTPD).
 *   A from: value is a string or a regexp, according to flags.
 *   A to: string is a replacement, and may reference regexp memory $1 - $9.
 *   A scope: value is likewise a string or (memory-less) regexp and
 *   determines the scope of URLs over which the rule applies.
 *   A prio: value is a single digit, and determines the priority of the
 *   rule.  That is to say, if two or more rules generate overlapping matches,
 *   the priority value will determine which rule prevails.  A lower
 *   priority value prevails over a higher one.
 *   A len: value is an integer, and applies only to a regexp from:
 *   It should be an estimate of the largest match size expected from
 *   the from: pattern.  It is used internally to determine the size of
 *   a continuity buffer, that avoids missing a match that spans more
 *   than one incoming data chunk arriving at the stream-editor filter.
 *   The default is 20.
 *   Performance tips:
 *    - A high len: value on any rule can severely impact performance,
 *      especially if mixed with short matches that match frequently.
 *    - Specify high-precedence rules (low prio: values) first in your
 *      configuration to avoid reshuffling edits while processing data.
 *  Example: a trivial ruleset to escape text in HTML:
 *   [out] scope::/html-escape/ from::"&" to:"&amp;"
 *   [out] scope::/html-escape/ from::< to:&lt;
 *   [out] scope::/html-escape/ from::> to:&gt;
 *   [out] scope::/html-escape/ from::/"/ to:/&quot;/
 *   Note, the first & has to be quoted, as the two ampersands in the line
 *   would otherwise be mis-parsed as a matching pair of delimiters.
 *   Quoting the &amp;, and the " line with //, are optional (and quoting
 *   is not applicable to the scope: field).
 *   The double-colons delimit flags, of which none are used in this example.
 */
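The continuity buffer described under len: can be illustrated with a small sketch.  Again this is not the plugin’s code: `ChunkEditor`, `feed` and `finish` are hypothetical names, and it handles a single plain-string rule, where the largest possible match is simply the length of the from: string (for a regexp that figure has to be estimated, which is what len: is for).

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical single-rule editor that tolerates matches split across chunks.
class ChunkEditor {
public:
  ChunkEditor(std::string from, std::string to)
    : from_(std::move(from)), to_(std::move(to)) {
    assert(!from_.empty());
  }

  // Consume one incoming chunk and return whatever is safe to send on.
  std::string feed(const std::string &chunk) {
    buf_ += chunk;  // splice held-back data onto the new chunk
    std::string out;
    std::size_t cur = 0;
    for (std::size_t pos = buf_.find(from_); pos != std::string::npos;
         pos = buf_.find(from_, cur)) {
      out.append(buf_, cur, pos - cur);
      out += to_;
      cur = pos + from_.size();
    }
    // The last (match size - 1) bytes might be the start of a match that
    // the next chunk completes, so hold them back instead of sending them.
    const std::size_t remaining = buf_.size() - cur;
    const std::size_t hold = std::min(from_.size() - 1, remaining);
    out.append(buf_, cur, remaining - hold);
    buf_.erase(0, buf_.size() - hold);
    return out;
  }

  // End of stream: nothing more can arrive, so flush what was held back.
  std::string finish() {
    std::string out;
    out.swap(buf_);
    return out;
  }

private:
  std::string from_;
  std::string to_;
  std::string buf_;  // continuity buffer carried between chunks
};
```

Feeding "say hel" and then "lo world" to an editor built with from "hello" and to "bye" holds back " hel" after the first chunk, which is what allows the match to be found once the second chunk arrives.  It also shows why a large len: hurts: the more data held back, the more gets re-scanned with every chunk.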

Traffic Server Summit (by ‘net)

I spent two days last week at the trafficserver summit.

Or rather, two evenings.  The summit was held in Silicon Valley (hosted by LinkedIn), while I remained at home in Blighty with a conferencing link, making me one of several remote attendees.  With an 8 hour time difference, each day started at 5pm and went on into the wee hours.  On the first day (Tuesday) this followed a day of regular work.  On the Wednesday I took a more sensible approach and the only work I did before the summit was a bit of gardening.  Despite that I felt more tired on the Wednesday.

The conferencing link was a decent enough instance of its kind, with regular video alongside screen sharing and text (though IRC does a better job with text).  The video was pointed at the speakers as they presented, and the screen sharing was used to share their presentations.  That was good enough to follow the presentations pretty well: indeed, sometimes better than being there, as I could read all the intricate slides and screens that would’ve been just a blur if I’d been present in the room.

Unfortunately most of the presentations involved discussion around the room, and that was much harder, sometimes impossible, to follow.  Also, speaking was not a good experience: I heard my voice some time after I’d spoken, and it sounded ghastly and indistinct, so I muted my microphone.  That was using just the builtin mike in the macbook.  I tried later with a proper headset when I had something to contribute, but alas it seems by then I (and I think all remote attendees, after the initial difficulties) was muted by the system.  So I had something approximating to read-only access.  And of course missed out on the social aspects of the event away from the presentations.

In terms of the mechanics of running an event like this, I think in retrospect we could make some modest improvements.  We had good two-way communication over IRC, and that might be better-harnessed.  Maybe rather than ad-hoc intervention, someone present (a session chair?) could act as designated proxy for remote attendees, and keep an eye on IRC for anyone looking to contribute to discussion.  Having such a person would probably have prompted me into action on a few occasions when I had a comment, question or suggestion.  Or perhaps better, IRC could be projected onto a second screen in the room, alongside the presenter’s materials.

The speakers and contents were well worth the limitations and antisocial hours of attending.  I found a high proportion of the material interesting, informative, and well-presented.  Alan, who probably knows more than anyone about Trafficserver internals, spoke at length on a range of topics.  The duo of Brian and Bryan (no, not a comedy act) talked about debugging and led discussion on test frameworks.

Other speakers addressed applications and APIs, and deployments, ops and tools.  A session I found unexpectedly interesting was Susan on the subject of how, in integrating sophisticated SSL capabilities in a module, she’s been working with Alan to extend the API to meet her needs.  It’s an approach from which I might just benefit, and I also need to take a look at whether Ironbee adequately captures all potentially-useful information available from SSL.

At the end I also made (via IRC) one suggestion for a session for the next summit: API review.  There’s a lot that’s implemented in Trafficserver core and utils that could usefully be made available to plugins via the API, even just by installing existing header files to a public includes directory.  Obviously that requires some control over what is intended to be public, and a stability deal over exported APIs.  I have some thoughts over how to deal with those, but I think that’s a subject for the wiki rather than a blog post.  One little plea for now: let’s not get hung up on what’s in C vs C++.  Accept that exported headers might be either, and let application developers deal with it.  If anyone then feels compelled to write a ‘clean’ wrapper, welcome their contribution!


Trafficserver 4

I meant to blog this upwards of a week ago, but I guess better late than never – at least when the subject isn’t so topical to the moment as to go instantly stale.

Apache Trafficserver 4.0.1 was released on August 30th.  It’s basically a production release of what has hitherto been the developer (unstable) series 3.3.x.  It’s actually also an incremental upgrade from earlier 3.x releases, in that existing users should be able to upgrade to 4.0 as a drop-in replacement or with very minimal reconfiguration, though of course test before deploying in production!  And if you use third-party add-ons, check with their developers or support.

Ironbee, the leading WAF and the add-on with which I’m substantially involved, has always tracked Trafficserver development versions, and is thus ready for Trafficserver 4.  You are encouraged to upgrade as soon as you are ready, subject of course to the general testing you would always apply to a change of platform.  If you find any issues, please raise them in the relevant fora for Ironbee and/or Trafficserver.

Please note, although I work on both the Trafficserver and Ironbee projects, I don’t speak on behalf of either of them when I blog.  None of the above is in any sense official.

Calling home: fatal?

Was asked if I could help solve a proxying problem this evening.  My provisional diagnosis raises a couple of issues of interest, and it would be good to confirm whether it makes sense.  Any iPhone or Android users out there should be able to say whether it’s plausible.

It started with a request: did I have an iPhone or iPad, or possibly a Mac (the latter in case it was something Apple-specific)?  Users have been unable to view pages through the proxy, but we have no detailed explanation beyond “doesn’t work”.  Yes I have a Mac, but it’s not here: is this a problem I can go away and look at?  Or, why don’t I fire up Konqueror, the KDE browser that uses the same KHTML engine as Apple?  What URL should I try to see if I can reproduce the error?

This is where it gets interesting.  The purpose of the project is to run a reverse proxy, but to test it I had to configure it as a forward proxy for Konq and navigate to a test URL.  It all worked fine, but the forward proxy is a test-only setup and blocks all but a selected whitelist of sites.

OK, next tack, can I see what’s happening if I have ssh access to the proxy itself?  Trafficserver’s logs are in squid format (with which I am unfamiliar) and show ERR_CONNECT_FAIL when the errors occur.  Looking that message up, I find it should just mean Trafficserver was unable to contact the origin server.  By about this time it’s also been established that Android clients have the same problem.

Reading the log, I’m guessing the clients having trouble are trying to “phone home”, so to test this I generate a couple of requests using Lynx through the proxy: one to a proxied site, the other to Google.  This confirms my suspicion: the Google request (which is blocked) generates precisely the log entry associated with failed requests.  It also helps clarify my reading of the squid-format logs, and confirms that the iPhone and Android clients’ failed requests are in fact to Google URLs.

So my question to iPhone and Android users: would a failed call-home request (to Google) throw an error that would prove fatal to a regular page loading from elsewhere?  That seems rather bizarre, though not really more so than Google maps/satnav refusing to work at all without a live data connection.

If that is indeed the problem, it still doesn’t explain why it should arise in normal use, when it’s a reverse proxy and the Google connection is direct.  Looks like either the problem is in fact on someone else’s network (combined with dumb browser design), or the messages seen in Trafficserver’s logs are a complete red herring and unrelated to the problem.  Hmmm.