[squid-users] CVE-2009-0801

Fri Dec 18 22:14:49 UTC 2015

On 19/12/2015 8:52 a.m., dc wrote:
> Hello,
> 
> please help me to understand the issue of CVE-2009-0801. Description of
> the CVE:
> 
> "Squid, when transparent interception mode is enabled, uses the HTTP
> Host header to determine the remote endpoint, which allows remote
> attackers to bypass access controls for Flash, Java, Silverlight, and
> probably other technologies, and possibly communicate with restricted
> intranet sites, via a crafted web page that causes a client to send HTTP
> requests with a modified Host header."
> 
> Looking at source code, to mitigate this issue, effectively
> client_dst_passthru is enforced even when client_dst_passthru is set to
> off in the configuration, when a mismatch between DNS resolved addresses
> and original request destination address is detected.
> 
> I do not really understand what a possible attack could look like.
> Could you provide an example?

The problem(s):

With CVE-2009-0801 the ORIGINAL_DST details arriving at Squid (the TCP
destination IP:port) belong to some domain such as ServerA.example.net.
But the HTTP message contains something different, like:

 GET /logo.jpg HTTP/1.1
 Host: attacker.example.com

The browser Same-Origin and sandbox Origin security protections ensure
that sandboxed scripts can only open TCP connections to the same origin
server(s) they are scoped for. But scripts can send any header values
they like, including fake Host on that connection once it is open.

If Squid were to use the Host header to route the message in any
situation where the ORIGINAL_DST details do not match the Host DNS
records, then Squid would be fetching and delivering content into a
browser sandbox from a server that sandbox did not permit.

The fix for that is simply to fetch from the ORIGINAL_DST server
IP:port, acting as if the proxy were not there.
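
To make the mismatch concrete, here is a rough Python sketch of that
check. It is not Squid's actual code. The names (fake_resolve,
host_matches_original_dst, upstream_for) and the addresses are made up
for illustration, and a dictionary stands in for real DNS lookups.

```python
import ipaddress
from typing import Callable, Iterable, Tuple

# Hypothetical DNS data standing in for real lookups (documentation ranges).
FAKE_DNS = {
    "servera.example.net": ["192.0.2.10"],
    "attacker.example.com": ["203.0.113.66"],
}


def fake_resolve(name: str) -> Iterable[str]:
    """Return the addresses our pretend DNS holds for a name."""
    return FAKE_DNS.get(name.lower(), [])


def host_matches_original_dst(
    host: str, original_dst_ip: str, resolve: Callable[[str], Iterable[str]]
) -> bool:
    """True when the intercepted destination IP is one the Host resolves to."""
    dst = ipaddress.ip_address(original_dst_ip)
    return any(ipaddress.ip_address(addr) == dst for addr in resolve(host))


def upstream_for(
    host: str, original_dst: Tuple[str, int], resolve: Callable[[str], Iterable[str]]
) -> Tuple[str, int]:
    """Pick where to send the request.

    On a mismatch, ignore the Host header for routing and go to the
    ORIGINAL_DST IP:port, as if the proxy were not there.
    """
    ip, port = original_dst
    if host_matches_original_dst(host, ip, resolve):
        return (host, port)
    return (ip, port)


# The client connected to ServerA's address but sent a forged Host header.
print(upstream_for("attacker.example.com", ("192.0.2.10", 80), fake_resolve))
# ('192.0.2.10', 80)
```

The forged Host never influences where the request goes. The request
lands on the server the browser sandbox actually connected to.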

BUT ... the proxy actually is there, so the cache has to be accounted
for. The cache stores things by URL. This causes Vuln #2 below if we use
the Host value as the domain name, like it is supposed to be. And if we
don't, the proxy is required to re-write the outgoing Host header to
match the URL host portion, meaning the outbound traffic would have
raw-IP:port for the domain name.

Vuln #2:  the attacker script can cause hijacking of popular content
URLs simply by fetching the above request from its own malicious server
with Host:google.com.
[no CVE for this bit since no published Squid was ever vulnerable].

This is not just bypassing its own sandbox protection, but causing
its attack payload to be delivered in future to another sandbox (on its
own machine OR any other machine using the proxy) in the follow-up proxy
cache HITs. That payload has escaped its own origin sandbox and now runs
with whatever permissions and data access the victim's domain normally
has access to (plus the ability to jump again by hijacking any of that
sandbox's valid servers/URLs).
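
A toy simulation of Vuln #2 may help. This is a sketch of a
hypothetical proxy that fetches from ORIGINAL_DST but keys its cache on
the Host header without verifying it. No published Squid behaved this
way. The addresses, the popular.example name and the naive_proxy
function are all invented for the example.

```python
# Hypothetical origin servers, keyed by IP address (documentation ranges).
ORIGINS = {
    "203.0.113.66": b"malicious payload",  # attacker's own server
    "198.51.100.5": b"genuine logo",       # real server for popular.example
}


def naive_proxy(original_dst_ip: str, host: str, path: str, cache: dict) -> bytes:
    """Fetch from ORIGINAL_DST, but cache under a URL built from Host."""
    key = f"http://{host}{path}"
    if key in cache:
        return cache[key]            # cache HIT: no origin contact at all
    body = ORIGINS[original_dst_ip]  # stand-in for a real HTTP fetch
    cache[key] = body                # stored under the (unverified) Host name
    return body


cache = {}

# Step 1: attacker script talks to its own server, with a forged Host.
naive_proxy("203.0.113.66", "popular.example", "/logo.jpg", cache)

# Step 2: a victim later asks for the genuine URL and gets a cache HIT.
victim_body = naive_proxy("198.51.100.5", "popular.example", "/logo.jpg", cache)
print(victim_body)  # b'malicious payload'
```

The victim never contacted the attacker's server, yet receives its
payload under the popular site's URL.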

The fix we chose to use is not to cache anything where the Host and DNS
details do not match.

BUT ... PAIN ... it turns out rather a *lot* of content has been using
systems where the Host and origin server DNS do not match all the time:
everything using Akamai CDN, Google web hosting, Geo-based DNS services,
so-called "client-based HTTP routing" as done by some popular AV
vendors, and many other smaller sites for odd reasons. Or at least they
don't match when the DNS is looked up through two different DNS routing
paths.

The alternative would be to use raw-IP:port on the URL and outgoing Host
value. The latter is mandatory but breaks Virtual Hosting on these
messages. Given the particular providers whose actions cause the pain,
breaking virtual hosting would be far worse than not caching. The common
cases are anyway low-HIT content or major providers with high speed
networks (very low MISS latency).
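
To see why a raw-IP:port Host value breaks Virtual Hosting, consider
this small sketch of an origin server that picks a site purely from the
Host header. The site names, the origin_serve function and the 404
fallback are assumptions for illustration. Real servers often fall back
to a default site instead, which is just as wrong for the client.

```python
from typing import Tuple

# Hypothetical sites sharing one IP address on the same origin server.
VHOSTS = {
    "site-a.example": b"content A",
    "site-b.example": b"content B",
}


def origin_serve(host_header: str) -> Tuple[int, bytes]:
    """Select a virtual host from the Host header (IPv4/hostname forms only)."""
    name = host_header.partition(":")[0].lower()  # drop any ":port" suffix
    if name in VHOSTS:
        return (200, VHOSTS[name])
    return (404, b"unknown virtual host")


print(origin_serve("site-b.example"))  # (200, b'content B')
print(origin_serve("192.0.2.10:80"))   # (404, b'unknown virtual host')
```

With only the IP address in the Host header the server has no way to
tell which of its sites was wanted.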

So the fix we use is to verify the DNS and Host details match, only
allowing cache storage and/or Host message routing when they do.
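
Put together, the policy looks roughly like the following sketch. It is
a simplified model, not Squid's implementation. The DNS and ORIGINS
tables and the verified_proxy function are invented. It always fetches
from the ORIGINAL_DST address for simplicity, and skipping the cache
lookup for unverified requests is an assumption of this sketch.

```python
from typing import Dict, Tuple

# Hypothetical fixtures (documentation address ranges).
DNS = {"popular.example": {"198.51.100.5"}}
ORIGINS = {
    "198.51.100.5": b"genuine logo",
    "203.0.113.66": b"malicious payload",
}


def verified_proxy(
    dst_ip: str, host: str, path: str, cache: Dict[str, bytes]
) -> Tuple[bytes, str]:
    """Use the cache only when ORIGINAL_DST matches the Host DNS records."""
    verified = dst_ip in DNS.get(host, set())
    key = f"http://{host}{path}"
    if verified and key in cache:
        return cache[key], "HIT"
    body = ORIGINS[dst_ip]   # stand-in for fetching from ORIGINAL_DST
    if verified:
        cache[key] = body    # store only verified responses
    return body, "MISS"


cache: Dict[str, bytes] = {}

# Forged Host from the attacker's own server: served, but never cached.
print(verified_proxy("203.0.113.66", "popular.example", "/logo.jpg", cache))
# (b'malicious payload', 'MISS')

# Genuine requests are cached and later served as HITs.
print(verified_proxy("198.51.100.5", "popular.example", "/logo.jpg", cache))
# (b'genuine logo', 'MISS')
print(verified_proxy("198.51.100.5", "popular.example", "/logo.jpg", cache))
# (b'genuine logo', 'HIT')
```

The attacker still gets its own content back into its own sandbox,
which is harmless, but nothing it sends can land in the shared cache.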

This still leaves us with pain in all those situations where non-match
happens. There are no fixes for that, just workarounds to improve the
chances of matching.

PS. The two vulnerabilities have been known about since at least the
1990s. The original choice was to leave the CVE-2009-0801 behaviour
happening to improve caching (and same-origin, sandbox etc did not exist
back then). But nowadays the browser protections do exist, and in 2009
active malware was found using the proxy behaviour to escape them. So
the CVE got allocated and we had to fix it. Opening the second
vulnerability was never an option, so we are where we are now.

PPS. Ideas or attempts at resolving the HIT side effect are welcome. Just
be aware that every attempt so far has led to one or other
vulnerability being re-opened and neither is acceptable behaviour. So
having what appears to be a good idea shot down and rejected is normal
when attacking this problem (it took a good year to get the fix as far
as it is).

Amos