[squid-users] Identifying the source of Invalid-request (squid 3) -> error:transaction-end-before-headers (Squid 4)

Sat Oct 15 17:57:38 UTC 2016

On 10/15/2016 07:36 AM, Jester Purtteman wrote:
> I have been seeing lines in my access log like the following:  
> 
> 1476535967.570      0 xxx.xxx.xxx.xxx TAG_NONE/400 4538 NONE
> error:invalid-request - HIER_NONE/- text/html
> 
> After some digging on this list I began to suspect websockets or other
> non-http traffic coming across port 80.  I decided to try squid
> 4.0.15 with on-unsupported-protocol.  I get what I am guessing to be the
> same result with new error text around it:
> 
> 1476536369.742      0 xxx.xxx.xxx.xxx NONE/000 0 NONE
> error:transaction-end-before-headers - HIER_NONE/- -
> 
> An interesting point to interject here is that my “Hits as % bytes sent”
> in 3.5.x has always been in the 2 to 5% range, but there are periods
> (sometimes long ones) where the inbound traffic to squid is much higher
> than the outbound.  When I switch to 4.0.x, I am now running about -27%
> (note, negative twenty-seven) as bytes, which makes me suspect it is
> logging the higher inbound than outbound now.

That difference sounds potentially important to me. I encourage you to
figure out what causes it (which is exactly what you have started doing,
of course).

> So, apparently, this
> unsupported protocol is triggering some sort of large download, but does
> not end up going to the client.

There might be some exceptions, but non-tunneled
error:transaction-end-before-headers are not supposed to trigger a
download. Squid does not know what to download because Squid cannot
parse the request...

When enabled, tunneled error:transaction-end-before-headers do download
data, of course, but you may be able to measure how much they download
then by finding the corresponding transactions in access log.

> I would like to know a couple things, first: is there some debugging
> level other than ALL,9 that might give me some illumination? 

I am sure there is. Once you know what the problem/cause is, it is easy
to come up with the corresponding optimal debug_options settings to show
the cause. Before that? You can try various settings (debugging sections
are semi-documented in doc/debug-sections.txt), but it is often not
worth your time.

> ALL,9
> generates about 15 MB of debug log per second at my current load level,
> and these errors aren’t real frequent, so I end up with ~ 400 MB of text
> that needs to be sifted through.  As you can imagine, that can be a bit
> brutal.  

I do not quite understand the problem of a 400MB ALL,9 cache.log. IMHO,
it is not much more difficult to deal with than a 1MB ALL,9 cache.log:
Either you can navigate ALL,9 noise or you cannot; the total log size
does not really matter much beyond a few MB levels IMO (provided you
have enough disk space to store it and logging itself does not slow
Squid down too much to reproduce the problem).

Please note that I am not saying that you are doing something wrong or
even complaining about a non-problem. I am only saying that I do not see
a [solvable in the context of this email thread] problem with ALL,9 logs
so I cannot help you solve it.

> So, I have a few questions I guess: 
> 
> (1)    For one thing, what are the implications of
> “on_unsupported_protocol tunnel all”?  

In rough terms, everything that is not SSL or HTTP will be tunneled.
AFAIK, non-HTTP inside SSL will not be tunneled yet (there is an
important patch for that going through squid-dev review right now).

> I did it as a quick attempt to
> see if that had any new and interesting impacts, but is it safe-ish?

I do not know what you mean by "safe", but, in a sense, it is more
"safe" than having no proxy at all because your access.log will show you
those tunnels.

> Am I letting the bad-guys come pouring through with that?

I personally do not know -- in general, it depends on the bad guys in
your environment. Others here may have deployment-specific stories that
I lack.

> (2)    What debug levels should I be thinking about to try and figure
> out what is happening.  Seems like we won’t get very far without
> identifying what is throwing that error.

If you do not want to deal with ALL,9, I would recommend this combination:

* a packet capture (you can limit the captured packet size if needed)
* access log format that logs all IPs and all TCP ports so that you can
match an access log line with captured packets/connection.

> (3)    Has anyone else seen this?  Right now, for example (after 10
> minutes of typing an email) I’m actually running -61% Hits as Bytes! 
> (Negative!)  Ouch!

As I said earlier, I am not sure the negative byte hit ratio is actually
related to these errors, but it could be. Squid v4 fixed a few
size-related accounting bugs. It is possible that we screw something up
in the process or that your actual byte hit ratio was always bad (but
you did not know about it because Squid was lying to you).

Can you compare Squid-reported numbers with OS/interfaces-reported
numbers somehow? If OS/interface numbers confirm v3.5 report but not
v4.0 report, then there is a bug we need to fix.

HTH,

Alex.