[squid-users] URL encoding in squid

Amos Jeffries squid3 at treenet.co.nz
Tue Feb 21 20:17:27 UTC 2017


On 21/02/2017 11:43 p.m., Anton wrote:
> Good day.
> 
> I have a squid+squidGuard configuration. I need to filter a lot of URLs with national
> symbols in them. My URL list consists mostly of percent-encoded URLs. When squid
> checks such a URL with squidGuard, it transmits the URL as-is, with no percent-encoding.
> 
> SquidGuard finds no match because its list holds the percent-encoded form of the URL.
> 
> The URL list is made from "zapret-info", if someone knows it :-). It can contain inconsistent data:
> %-encoded URLs can be in cp1251 or utf-8 after decoding, and some URLs are not encoded at all.
> I cannot reliably decode the %-encoded URLs.

Ew.

> 
> IMHO the better way is to %-encode the URLs that are not encoded and to use the others as-is.
> 
> 
> So can squid+squidGuard deal with percent-encoded URLs?
> 

Squid should be normalizing the %-encoding on the URLs as they arrive,
but I'm not seeing where it does that in the code so maybe not. What
SquidGuard does with them or its input data file is not under Squid control.
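
To make "normalizing the %-encoding" concrete, here is a minimal Python sketch
of the list-side approach Anton suggests: raw non-ASCII characters are
%-encoded and existing escapes are kept without decoding them. The function
names are hypothetical, raw characters are assumed to be UTF-8, and hex digits
are upper-cased so both forms compare equal. This is not what Squid or
SquidGuard does internally, and it cannot repair cp1251 escapes; those pass
through untouched.

```python
import re
import string
from urllib.parse import quote

# Matches an existing percent-escape such as %d0 or %D0.
_ESCAPE = re.compile(r"%[0-9A-Fa-f]{2}")

# Keep all ASCII punctuation literal so URL delimiters survive.
# Letters and digits are never escaped by quote().
_SAFE = string.punctuation


def _encode_raw(text: str) -> str:
    """Percent-encode whitespace and non-ASCII characters (assumed UTF-8)."""
    return quote(text, safe=_SAFE, encoding="utf-8")


def normalize_url(url: str) -> str:
    """Encode unencoded parts of a URL; leave existing escapes undecoded."""
    out = []
    pos = 0
    for match in _ESCAPE.finditer(url):
        # Text between escapes may hold raw national characters.
        out.append(_encode_raw(url[pos:match.start()]))
        # Existing escape: keep the byte value, upper-case the hex digits.
        out.append(match.group(0).upper())
        pos = match.end()
    out.append(_encode_raw(url[pos:]))
    return "".join(out)
```

Running every list entry through normalize_url() only helps if the URL the
helper receives is in the same form, so the same normalization would have to
apply on both sides of the comparison.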

Also, SG is very outdated and no longer maintained. You might find
ufdbGuard better able to handle this nasty input. I've bcc'd Marcus in
case there is something he can do (or has done) about this type of mess in
that helper.

> Is it possible to pass a %-encoded URL to squidGuard?

Not with Squid-3.4. The 3.5 releases have a url_rewrite_extras directive
which takes logformat codes. You could use that to send an extra
%-encoded copy of the URL to the helper in addition to the normal URL
input. (Sorry, there is no package for 3.5 in Debian 8 yet.)

Amos


