[squid-users] Header order in squid proxy

Alex Rousskov rousskov at measurement-factory.com
Thu Jun 22 20:06:22 UTC 2017


On 06/22/2017 12:54 PM, Sonya Roy wrote:
> The sites I am talking about check the User-Agent header and make sure
> the user-agent is for a well-known browser, i.e. a browser that they
> support. Any browser like Firefox, Chrome, Safari, or Edge, for example,
> sends the headers in a certain order, and the order depends on the
> browser. This applies to well-known headers like Accept,
> Accept-Language, Accept-Encoding, Content-Length, Host, Connection,
> Referer, Cookie, etc. The sites match the order of the received request
> against the standard header order for the browser named in the user-agent.

FWIW, Connection and possibly some "etc." headers are hop-by-hop headers
so if a blocking site really pays attention to them, it should be told
to exclude them.
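
To illustrate what "excluding them" could look like on the checking side,
here is a minimal sketch, assuming the site compares field order as a list
of names. The function name end_to_end_order and the (name, value) pair
representation are hypothetical; the hop-by-hop names are the ones listed in
RFC 2616 section 13.5.1, plus anything nominated by the Connection field.

```python
# Hop-by-hop header names listed in RFC 2616 section 13.5.1, lowercased.
HOP_BY_HOP = frozenset({
    "connection", "keep-alive", "proxy-authenticate", "proxy-authorization",
    "te", "trailers", "transfer-encoding", "upgrade",
})


def end_to_end_order(fields):
    """Return lowercased names of end-to-end fields, in received order.

    `fields` is a list of (name, value) pairs (an assumed representation).
    Fields nominated by the Connection header are skipped as well.
    """
    nominated = set()
    for name, value in fields:
        if name.lower() == "connection":
            nominated.update(
                token.strip().lower()
                for token in value.split(",")
                if token.strip()
            )
    skip = HOP_BY_HOP | nominated
    return [name.lower() for name, _ in fields if name.lower() not in skip]
```

With such a filter, a request whose Connection field was moved by a proxy
yields the same name sequence as the original request, so an order check
based only on end-to-end fields would not be affected by that move.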


> Could you point me to where I should look in the Squid source code?

The answer depends on whether you want to:

A) prevent pointless edits (difficult and less effective but has a
fighting chance of official acceptance because it is a useful
performance optimization) or

B) simply reorder all the fields just before sending them, based on a
User-Agent field-driven order table (easy to hack in and effective but
less likely to be officially accepted due to performance overheads and
configuration/support complexities).
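
Option B can be sketched outside of Squid as follows. This is only an
illustration of the idea, not Squid code: the ORDER_TABLE contents, the
substring matching on User-Agent, and the reorder_fields name are all
invented for the example, and the orders shown are not measured from real
browsers.

```python
# Hypothetical order table. The keys, the substring matching, and the
# orders themselves are made up for illustration only.
ORDER_TABLE = {
    "Firefox": ["host", "user-agent", "accept", "accept-language",
                "accept-encoding", "referer", "cookie", "connection"],
    "Chrome": ["host", "connection", "user-agent", "accept", "referer",
               "accept-encoding", "accept-language", "cookie"],
}


def reorder_fields(fields):
    """Reorder (name, value) pairs using the row matching User-Agent.

    Fields missing from the row keep their relative order and go last.
    If no row matches, the fields are returned in their current order.
    """
    user_agent = next(
        (value for name, value in fields if name.lower() == "user-agent"), "")
    order = next(
        (row for key, row in ORDER_TABLE.items() if key in user_agent), None)
    if order is None:
        return list(fields)
    rank = {name: index for index, name in enumerate(order)}
    # sorted() is stable, so unlisted fields stay in their relative order.
    return sorted(fields, key=lambda f: rank.get(f[0].lower(), len(order)))
```

The sort runs on every forwarded request, which is the kind of per-request
overhead mentioned above, and the table itself is the part that would need
configuration and ongoing maintenance.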

If you want a general vague answer, search for calls to non-const
HttpHeader methods like HttpHeader::delByName() and
HttpHeader::insertEntry(). There are about 20-30 potentially relevant
methods AFAICT. And examine the sending code in
HttpStateData::httpBuildRequestHeader().
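
To show why those mutating calls matter for option A, here is a toy model
of an ordered header list. ToyHeader and its methods are hypothetical and
deliberately simplified; they are not Squid's HttpHeader API. The sketch
shows that deleting a field and adding it back moves it to the end, and
that a guard can skip the edit when the value would not change.

```python
class ToyHeader:
    """Ordered list of (name, value) pairs. Not Squid's HttpHeader."""

    def __init__(self, fields):
        self.fields = list(fields)

    def names(self):
        return [name for name, _ in self.fields]

    def del_by_name(self, name):
        # Remove every field with this name (case-insensitive).
        self.fields = [f for f in self.fields if f[0].lower() != name.lower()]

    def add(self, name, value):
        # New fields always go to the end of the list.
        self.fields.append((name, value))

    def set_if_changed(self, name, value):
        """Hypothetical guard: skip the delete+add when exactly one field
        with this name exists and it already carries the wanted value."""
        current = [v for n, v in self.fields if n.lower() == name.lower()]
        if current == [value]:
            return False  # nothing to edit, position preserved
        self.del_by_name(name)
        self.add(name, value)
        return True
```

A guard like this only helps when the deletion and the addition happen in
one place. When they are separated by other code or by time, the deleted
position would have to be remembered somehow, which this sketch does not
attempt.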


Please note that discussions about Squid code belong on squid-dev,
not squid-users.


HTH,

Alex.


> On Fri, Jun 23, 2017 at 12:02 AM, Alex Rousskov wrote:
> 
>     On 06/22/2017 11:49 AM, Sonya Roy wrote:
> 
>     > I noticed that squid changes the header order received from the client
>     > before sending it to the origin server.
>     >
>     > I assume this is because squid parses the header data and adds some
>     > headers depending on the config file and then recreates the header data.
> 
>     IIRC, modern Squids change a header field position when the received
>     field is deleted and then added back. This is typical for hop-by-hop
>     headers such as Connection, but there are other reasons for Squid to
>     delete and add a header field. When the value of the added field is the
>     same as the value of the removed field, such pointless "editing" looks
>     like mindless "reordering" to the outside observer.
> 
>     The two actions (field deletion and addition) may happen in a single
>     piece of code or may be separated by lots of code and even time.
>     Preventing pointless editing in the former cases is straightforward, but
>     the latter cases are difficult to handle. Correct avoidance of pointless
>     editing may improve performance and, if it does, can be considered a
>     useful optimization on its own, regardless of your use case.
> 
> 
>     > Is there any way to prevent this?
> 
>     Not without changing Squid code (or adding more proxies). However,
>     before we even talk about code changes, we should clarify the problem we
>     are dealing with. The questions below will guide you.
> 
>     It is probably much easier to ensure some fixed field send order
>     (regardless of the received order) than to preserve the received order.
>     Will a fixed order (e.g., always alphabetical) address your use case?
>     This feature will hurt performance, but you might be able to convince
>     others to accept it if you have a very compelling/specific/detailed use
>     case because it can be disabled by default.
> 
> 
>     > I am asking because some sites detect bots using the header order and
>     > they drop any such connection. So they unintentionally block squid
>     > proxies even if they are not being used by a bot.
> 
>     Are you implying that bots often change header field order between their
>     requests? Or that bots often use a different (fixed) header field order
>     than the (fixed) field order used by non-bots? Preserving received order
>     may help in the former case but not in the latter case.
> 
>     Also, do those blocking sites pay attention to all headers or just
>     end-to-end headers?
> 
>     Please note that there are many other ways to detect a proxy so if a
>     site wants to block proxies rather than bots, then it is probably
>     pointless to fight it (or, at least, the Squid Project should not).
> 
> 
>     HTH,
> 
>     Alex.
> 
> 
> 
> 


