Transcoding module

One of the new features in mod_proxy_html 3.0 is improved i18n support, adding character sets supported by apr_xlate (normally iconv) to those supported by libxml2.

In generalising this for other filter modules, I’ve decided to split it out into a new transcoding module. It will be tied to libxml2 applications, and will be usable both before and after any libxml2-based content filter. For maximum efficiency, it will only handle charsets that are not supported by libxml2.
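The "only handle what libxml2 can't" rule can be sketched in a few lines. This is an illustration in Python, not the module's code: `NATIVE_CHARSETS` is a hypothetical stand-in for whatever the parser accepts natively, `prepare_for_parser` is an invented name, and Python's codecs play the role that apr_xlate/iconv would play in the real module.

```python
import codecs

# Assumption: a stand-in list of charsets the parser handles by itself.
# The real set depends on how libxml2 was built.
NATIVE_CHARSETS = {
    codecs.lookup(name).name
    for name in ("utf-8", "utf-16", "iso-8859-1", "ascii")
}


def prepare_for_parser(data, charset):
    """Return (bytes, charset) suitable for handing to the parser.

    Natively supported input is passed through untouched; anything
    else is transcoded to UTF-8 first.
    """
    name = codecs.lookup(charset).name  # normalise aliases and case
    if name in NATIVE_CHARSETS:
        return data, name  # no work needed, so no copy is made
    return data.decode(name).encode("utf-8"), "utf-8"
```

Passing natively supported input through unchanged is where the efficiency comes from: the common cases never pay for a conversion.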

It will also support additional preprocessing fixups that experience has shown to be necessary. These include adjusting charset declarations that are invalidated by transcoding, and fixing tag-soup problems that screw up libxml2’s htmlParser.
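To make the charset-declaration fixup concrete: once a page has been transcoded to UTF-8, a leftover `<meta>` declaration naming the old charset is simply wrong. Here is a rough sketch of the idea in Python, under the assumption that a regex over the first declaration is good enough for illustration; `fix_charset_declaration` is a hypothetical helper, not something the module provides.

```python
import re

# Matches the charset value in the first <meta ...charset=...> tag,
# covering both the http-equiv form and the short <meta charset=...> form.
_META_CHARSET = re.compile(
    rb'(<meta\b[^>]*?\bcharset\s*=\s*["\']?)([A-Za-z0-9._:-]+)',
    re.IGNORECASE,
)


def fix_charset_declaration(html, new_charset):
    """Rewrite the first meta charset declaration to name new_charset.

    `html` is the already-transcoded document as bytes. Documents
    without a declaration are returned unchanged.
    """
    replacement = new_charset.encode("ascii")
    return _META_CHARSET.sub(
        lambda m: m.group(1) + replacement, html, count=1
    )
```

A regex like this is itself a tag-soup-grade tool and will miss odd cases, such as a declaration split across attributes in unusual ways, so treat it as a picture of the problem rather than a robust fix.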

It won’t do anything useful yet, but I’ve committed mod_xml2enc as a work-in-progress to svn at apache.webthing.com. When ready, it’ll borrow from several existing modules, and replace transcoding and preprocessing functions in them.

Posted on December 18, 2007, in apache, html, xml.
