[squid-users] Header order in squid proxy
Alex Rousskov
rousskov at measurement-factory.com
Thu Jun 22 20:06:22 UTC 2017
On 06/22/2017 12:54 PM, Sonya Roy wrote:
> The sites I am talking about check the User-Agent header and make sure
> the user-agent is for a well-known browser, i.e. a browser that they
> support. Any browser, such as Firefox, Chrome, Safari, or Edge,
> sends its headers in a certain order, and that order depends on the
> browser. This applies to well-known headers like Accept,
> Accept-Language, Accept-Encoding, Content-Length, Host, Connection,
> Referer, Cookie, etc. The sites match the order of the received request
> against the standard header order for the browser named in User-Agent.
FWIW, Connection and possibly some of the "etc." headers are hop-by-hop
headers, so if a blocking site really pays attention to them, it should
be told to exclude them.
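
To make the hop-by-hop point concrete, here is a minimal sketch of how a
checker could drop hop-by-hop fields before comparing header order. It is
not taken from Squid or from any blocking site; the function name
end_to_end_order and the representation of a request as a list of
(name, value) pairs are assumptions made for illustration. The name set
follows the hop-by-hop list in RFC 2616 section 13.5.1, plus any field
nominated by the Connection header.

```python
# Hop-by-hop field names (lowercase), following RFC 2616 section 13.5.1.
HOP_BY_HOP = {
    "connection", "keep-alive", "proxy-authenticate", "proxy-authorization",
    "te", "trailer", "transfer-encoding", "upgrade",
}


def end_to_end_order(fields):
    """Return lowercase names of end-to-end fields, in received order.

    `fields` is a list of (name, value) pairs; this shape is an assumption
    of the sketch, not a Squid data structure.
    """
    # Any field named as a token in Connection is hop-by-hop as well.
    nominated = set()
    for name, value in fields:
        if name.lower() == "connection":
            for token in value.split(","):
                token = token.strip().lower()
                if token:
                    nominated.add(token)

    skip = HOP_BY_HOP | nominated
    return [name.lower() for name, _ in fields if name.lower() not in skip]
```

Comparing only the list returned by such a filter would keep a proxy's
rewriting of Connection and similar fields from affecting the verdict.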
> Could you point me to where I should look in the Squid source code?
The answer depends on whether you want to:
A) prevent pointless edits (difficult and less effective but has a
fighting chance of official acceptance because it is a useful
performance optimization) or
B) simply reorder all the fields just before sending them, based on a
User-Agent field-driven order table (easy to hack in and effective but
less likely to be officially accepted due to performance overheads and
configuration/support complexities).
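
As a rough illustration of option B, the sketch below reorders fields
according to a table selected by a substring of the User-Agent value. It
is plain Python, not Squid code. ORDER_TABLE, pick_order, and
reorder_fields are hypothetical names, and the order listed for "firefox"
is a placeholder rather than the order any real browser uses.

```python
# Hypothetical order table: User-Agent substring -> lowercase field names.
# The listed order is a placeholder, not a measured browser order.
ORDER_TABLE = {
    "firefox": ["host", "user-agent", "accept", "accept-language",
                "accept-encoding", "referer", "cookie"],
}


def pick_order(user_agent, table):
    """Return the first table entry whose key occurs in the User-Agent."""
    ua = user_agent.lower()
    for key, order in table.items():
        if key in ua:
            return order
    return None


def reorder_fields(fields, table=ORDER_TABLE):
    """Reorder (name, value) pairs just before sending.

    Fields missing from the chosen order keep their relative positions
    and are placed after the listed ones. If no table entry matches the
    User-Agent, the fields are returned in their current order.
    """
    ua = next((v for n, v in fields if n.lower() == "user-agent"), "")
    order = pick_order(ua, table)
    if order is None:
        return list(fields)
    rank = {name: i for i, name in enumerate(order)}
    # sorted() is stable, so ties (unlisted fields) keep their order.
    return sorted(fields, key=lambda f: rank.get(f[0].lower(), len(rank)))
```

The per-request table lookup and sort are where the performance overhead
mentioned in option B would come from, and maintaining the table is the
configuration/support burden.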
If you want a general vague answer, search for calls to non-const
HttpHeader methods like HttpHeader::delByName() and
HttpHeader::insertEntry(). There are about 20-30 potentially relevant
methods AFAICT. And examine the sending code in
HttpStateData::httpBuildRequestHeader().
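
To show why delete-then-insert call pairs matter for option A, here is a
small model of the effect. It is plain Python over a list of
(name, value) pairs and does not reproduce the behavior or signatures of
HttpHeader::delByName() or HttpHeader::insertEntry(); both function names
below are hypothetical.

```python
def delete_then_add(fields, name, value):
    """Model of a 'pointless edit': remove the field, then append it.

    Even when `value` equals the old value, the field ends up last.
    """
    kept = [(n, v) for n, v in fields if n.lower() != name.lower()]
    kept.append((name, value))
    return kept


def set_preserving_position(fields, name, value):
    """Update the field where it already sits; append only if absent.

    Later duplicates of the same field are dropped, mirroring the
    single-field result of delete_then_add().
    """
    out = []
    replaced = False
    for n, v in fields:
        if n.lower() == name.lower():
            if not replaced:
                out.append((n, value))
                replaced = True
        else:
            out.append((n, v))
    if not replaced:
        out.append((name, value))
    return out
```

With the same input and an unchanged value, the first function moves the
field to the end while the second leaves the list as received, which is
the difference an outside observer sees as "reordering".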
Please note that the discussion about Squid code belongs to squid-dev,
not squid-users.
HTH,
Alex.
> On Fri, Jun 23, 2017 at 12:02 AM, Alex Rousskov wrote:
>
> On 06/22/2017 11:49 AM, Sonya Roy wrote:
>
> > I noticed that squid changes the header order received from the client
> > before sending it to the origin server.
> >
> > I assume this is because squid parses the header data and adds some
> > headers depending on the config file and then recreates the header data.
>
> IIRC, modern Squids change a header field position when the received
> field is deleted and then added back. This is typical for hop-by-hop
> headers such as Connection, but there are other reasons for Squid to
> delete and add a header field. When the value of the added field is the
> same as the value of the removed field, such pointless "editing" looks
> like mindless "reordering" to the outside observer.
>
> The two actions (field deletion and addition) may happen in a single
> piece of code or may be separated by lots of code and even time.
> Preventing pointless editing in the former cases is straightforward, but
> the latter cases are difficult to handle. Correct avoidance of pointless
> editing may improve performance and, if it does, can be considered a
> useful optimization on its own, regardless of your use case.
>
>
> > Is there any way to prevent this?
>
> Not without changing Squid code (or adding more proxies). However,
> before we even talk about code changes, we should clarify the problem we
> are dealing with. The questions below will guide you.
>
> It is probably much easier to ensure some fixed field send order
> (regardless of the received order) than to preserve the received order.
> Will a fixed order (e.g., always alphabetical) address your use case?
> This feature will hurt performance, but you might be able to convince
> others to accept it if you have a very compelling/specific/detailed use
> case because it can be disabled by default.
>
>
> > I am asking because some sites detect bots using the header order and
> > they drop any such connection. So they unintentionally block Squid
> > proxies even if the proxy is not being used by a bot.
>
> Are you implying that bots often change header field order between their
> requests? Or that bots often use a different (fixed) header field order
> than the (fixed) field order used by non-bots? Preserving received order
> may help in the former case but not in the latter case.
>
> Also, do those blocking sites pay attention to all headers or just
> end-to-end headers?
>
> Please note that there are many other ways to detect a proxy so if a
> site wants to block proxies rather than bots, then it is probably
> pointless to fight it (or, at least, the Squid Project should not).
>
>
> HTH,
>
> Alex.