I’ve just announced a public dev version of mod_proxy_html, incorporating a range of updates. That means it works nicely for me, and I’d like the outside world to start test-driving it.
First, there’s much better internationalisation support.
- A charset not supported by libxml2 can be aliased to a supported one.
- A charset that is neither supported directly nor aliased will be converted to Unicode using apr_xlate (an iconv wrapper).
- A default input encoding (for completely unlabelled content) can be configured.
- Output can be filtered through apr_xlate to a server admin’s desired encoding.
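
A minimal sketch of how those four options might look in httpd.conf. It assumes the ProxyHTMLCharset* directive names used by the version 3.0 documentation; the charsets and alias names are purely illustrative.

```apache
# Map labels libxml2 does not recognise onto a charset it does support.
# First argument is the supported charset; the rest are aliases for it.
# ("latin-1" and "x-latin1" are illustrative labels, not a recommendation.)
ProxyHTMLCharsetAlias ISO-8859-1 latin-1 x-latin1

# Encoding to assume when the backend gives no charset information at all
ProxyHTMLCharsetDefault ISO-8859-1

# Re-encode the filtered output through apr_xlate to the admin's chosen charset
ProxyHTMLCharsetOut ISO-8859-15
```

Anything that is neither supported nor aliased needs no directive of its own: it falls through to the apr_xlate conversion described above.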
Second, support for rewriting proprietary HTML variants is now configurable. Indeed, the definitions of all link and event attributes are now delegated to httpd.conf, and an example configuration is supplied, defining the links and events in W3C HTML 4.01 and XHTML 1.0.
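
For a flavour of what that delegation looks like, here is a short excerpt in the style of the supplied configuration, assuming the ProxyHTMLLinks and ProxyHTMLEvents directives. The `embed` line is an illustrative addition for a proprietary element, not something taken from the shipped file.

```apache
# Each ProxyHTMLLinks line names an element, then its URL-bearing attributes
ProxyHTMLLinks a      href
ProxyHTMLLinks img    src longdesc usemap
ProxyHTMLLinks form   action
ProxyHTMLLinks link   href

# Illustrative extension: a non-W3C element whose src should also be rewritten
ProxyHTMLLinks embed  src

# Attributes whose values are scripts (only examined in extended mode)
ProxyHTMLEvents onclick ondblclick onmouseover onmouseout onload onsubmit
```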
When I announced it here I got two requests, one of which was easy to satisfy. You can now override the module's refusal to run when not in a proxy context, or when the input isn't HTML. This of course is at your own risk, and is there to help deal with broken backends.
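
A hedged sketch of the override, assuming it is triggered by an environment variable named PROXY_HTML_FORCE (check the documentation for your version before relying on that name). The location path is hypothetical.

```apache
# Hypothetical path for a backend that mislabels its HTML
<Location /legacy-app/>
    SetOutputFilter proxy-html
    # Assumption: setting this variable bypasses the proxy/HTML sanity checks
    SetEnv PROXY_HTML_FORCE 1
</Location>
```

Scoping the override to a single location keeps the normal safety checks in force everywhere else.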
This is one of a number of new fixes available for broken backends. Others include an option to ignore leading junk, and the capability to strip out bogus or deprecated markup and output cleaned-up HTML or XHTML.
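
As a sketch of selecting the output flavour, assuming the cleanup is tied to the ProxyHTMLDoctype directive and its optional Legacy keyword:

```apache
# Declare and serialise output as HTML 4.01 (Strict)
ProxyHTMLDoctype HTML

# Alternative: XHTML 1.0 syntax, declared Transitional so that
# deprecated markup from older backends is tolerated
# ProxyHTMLDoctype XHTML Legacy
```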
Finally, Version 3 introduces more flexible configuration. It now supports variable interpolation in ProxyHTMLURLMap rules, and allows an additional clause making application of individual rules conditional on an environment variable. So configuration can now be dynamic – e.g. driven by mod_rewrite – when <Location> / <LocationMatch> sections aren’t sufficiently flexible.
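
A minimal sketch of a dynamic rule driven by mod_rewrite. It assumes the `${var}` interpolation syntax, the `V` flag (interpolate in the from-pattern) and a trailing condition naming an environment variable; the hostnames and the variable name `backend` are hypothetical.

```apache
# Interpolation has to be switched on explicitly
ProxyHTMLInterp On

# Set an environment variable per request (hypothetical hostnames)
RewriteEngine On
RewriteCond %{HTTP_HOST} ^intranet\.
RewriteRule ^ - [E=backend:app1.example.internal]

# Rewrite links to whichever backend was chosen, but only for requests
# where the "backend" variable is actually set
ProxyHTMLURLMap http://${backend}/ / V backend

# A negated condition applies a rule only when the variable is unset
ProxyHTMLURLMap http://default.example.internal/ / "" !backend
```

Requests that never match the RewriteCond leave `backend` unset, so the first map is skipped and the second applies instead.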