[squid-users] Header order in squid proxy
Sonya Roy
sonyaroy75 at gmail.com
Thu Jun 22 18:54:25 UTC 2017
The sites I am talking about check the User-Agent header and make sure the
user agent is a well-known browser, i.e. a browser that they support.
Any browser, for example Firefox, Chrome, Safari or Edge, sends its
headers in a certain order, and that order depends on the browser. This
applies to well-known headers like Accept, Accept-Language,
Accept-Encoding, Content-Length, Host, Connection, Referer, Cookie, etc.
The sites compare the header order of the received request with the
standard header order of the browser named in the User-Agent.
This detects poorly written bots (i.e. ones that don't take this header
order into account) built with python requests, or in any other language
where the requests are made through a low-level HTTP library.
So keeping the header order sent by the client intact would prevent these
sites from dropping proxied requests (ones that pass through squid). I know
for a fact that they don't intend to block proxies.
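
To make the check concrete, here is a rough sketch of what such a site might
do. This is my guess at the logic, not any site's actual code. The names
EXPECTED_ORDER and order_matches are hypothetical, and the header order
listed is made up for illustration, not measured from a real browser.

```python
# Hypothetical per-browser header order; real orders vary by browser,
# version and request type, so treat this table as a placeholder.
EXPECTED_ORDER = {
    "examplebrowser": [
        "Host", "User-Agent", "Accept", "Accept-Language",
        "Accept-Encoding", "Referer", "Cookie", "Connection",
    ],
}


def order_matches(received, expected):
    """Return True if the known headers in `received` appear in the same
    relative order as in `expected`. Unknown headers are ignored and
    names are compared case-insensitively."""
    rank = {name.lower(): i for i, name in enumerate(expected)}
    positions = [rank[n.lower()] for n in received if n.lower() in rank]
    return positions == sorted(positions)


expected = EXPECTED_ORDER["examplebrowser"]

# Order as sent by the (hypothetical) browser.
direct = ["Host", "User-Agent", "Accept", "Accept-Encoding", "Connection"]

# Same headers after a proxy moved Connection ahead of the others.
proxied = ["Host", "Connection", "User-Agent", "Accept", "Accept-Encoding"]

print(order_matches(direct, expected))
print(order_matches(proxied, expected))
```

With a check like this, the second request is rejected even though it carries
the same headers and values as the first.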
Could you point me to where I should look in the source code of squid?
I mean the part that handles the header data sent from the client.
With regards,
Sonya Roy.
On Fri, Jun 23, 2017 at 12:02 AM, Alex Rousskov <
rousskov at measurement-factory.com> wrote:
> On 06/22/2017 11:49 AM, Sonya Roy wrote:
>
> > I noticed that squid changes the header order received from the client
> > before sending it to the origin server.
> >
> > I assume this is because squid parses the header data and adds some
> > headers depending on the config file and then recreates the header data.
>
> IIRC, modern Squids change a header field position when the received
> field is deleted and then added back. This is typical for hop-by-hop
> headers such as Connection, but there are other reasons for Squid to
> delete and add a header field. When the value of the added field is the
> same as the value of the removed field, such pointless "editing" looks
> like mindless "reordering" to the outside observer.
>
> The two actions (field deletion and addition) may happen in a single
> piece of code or may be separated by lots of code and even time.
> Preventing pointless editing in the former cases is straightforward, but
> the latter cases are difficult to handle. Correct avoidance of pointless
> editing may improve performance and, if it does, can be considered a
> useful optimization on its own, regardless of your use case.
>
>
> > Is there any way to prevent this?
>
> Not without changing Squid code (or adding more proxies). However,
> before we even talk about code changes, we should clarify the problem we
> are dealing with. The questions below will guide you.
>
> It is probably much easier to ensure some fixed field send order
> (regardless of the received order) than to preserve the received order.
> Will a fixed order (e.g., always alphabetical) address your use case?
> This feature will hurt performance, but you might be able to convince
> others to accept it if you have a very compelling/specific/detailed use
> case because it can be disabled by default.
>
>
> > I am asking because some sites detect bots using the header order and
> > they drop any such connection. So they unintentionally block squid
> > proxies even if it's not being used by a bot.
>
> Are you implying that bots often change header field order between their
> requests? Or that bots often use a different (fixed) header field order
> than the (fixed) field order used by non-bots? Preserving received order
> may help in the former case but not in the latter case.
>
> Also, do those blocking sites pay attention to all headers or just
> end-to-end headers?
>
> Please note that there are many other ways to detect a proxy so if a
> site wants to block proxies rather than bots, then it is probably
> pointless to fight it (or, at least, the Squid Project should not).
>
>
> HTH,
>
> Alex.
>
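
A toy model of the delete-then-add effect described in the quoted reply,
assuming header fields are kept as an ordered list and that an added field
is appended at the end. This is not Squid's code; delete_field and add_field
are made-up helpers used only to show why an unchanged value can still end
up in a different position.

```python
def delete_field(fields, name):
    """Return a copy of `fields` without any field called `name`."""
    return [(n, v) for (n, v) in fields if n.lower() != name.lower()]


def add_field(fields, name, value):
    """Return a copy of `fields` with (name, value) appended at the end."""
    return fields + [(name, value)]


received = [
    ("Host", "example.com"),
    ("Connection", "keep-alive"),
    ("Accept", "*/*"),
]

# Delete Connection, then add it back with the same value.
forwarded = add_field(delete_field(received, "Connection"),
                      "Connection", "keep-alive")

print([name for name, _ in received])   # ['Host', 'Connection', 'Accept']
print([name for name, _ in forwarded])  # ['Host', 'Accept', 'Connection']
```

The set of fields and their values is identical before and after, but the
Connection field has moved from second place to last.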