[squid-users] How to create a simple whitelist using regexes?

RB ronthecon at gmail.com
Mon Oct 15 15:56:50 UTC 2018


I think I have found the issue, or at least a clue to what is going on:

2018/10/15 15:05:45.083 kid1| RegexData.cc(71) match: aclRegexData::match: checking 'wiki.squid-cache.org:443'
2018/10/15 15:05:45.084 kid1| RegexData.cc(82) match: aclRegexData::match: looking for '(^https://wiki.squid-cache.org/SquidFaq/SquidAcl.*)|(squid-cache.org/SquidFaq/SquidAcl.*)'
2018/10/15 15:05:45.084 kid1| Acl.cc(321) checklistMatches: ACL::ChecklistMatches: result for 'whitelist' is 0

The above seems to be applying the regex to "wiki.squid-cache.org:443"
instead of to "https://wiki.squid-cache.org/SquidFaq/SquidAcl". I then added
the regex ".*squid-cache.org.*" to my list of regular expressions, and now I
see this:

2018/10/15 15:16:03.641 kid1| RegexData.cc(71) match: aclRegexData::match: checking 'wiki.squid-cache.org:443'
2018/10/15 15:16:03.641 kid1| RegexData.cc(82) match: aclRegexData::match: looking for '(^https?://[^/]+/wiki.squid-cache.org/SquidFaq/SquidAcl.*)|(squid-cache.org.*)'
2018/10/15 15:16:03.641 kid1| RegexData.cc(93) match: aclRegexData::match: match '(^https?://[^/]+/wiki.squid-cache.org/SquidFaq/SquidAcl.*)|(squid-cache.org.*)' found in 'wiki.squid-cache.org:443'
2018/10/15 15:16:03.641 kid1| Acl.cc(321) checklistMatches: ACL::ChecklistMatches: result for 'whitelist' is 1
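
The two log excerpts above can be reproduced outside Squid with a minimal
sketch. It uses Python's re module as a stand-in for Squid's regex engine
(an assumption: Squid uses POSIX regular expressions, but these particular
patterns should behave the same way in both). The patterns and strings are
copied from the log with the e-mail line wrapping removed; the variable and
function names are only illustrative.

```python
import re

# The string Squid reported checking, and the URL that was requested.
connect_target = "wiki.squid-cache.org:443"
full_url = "https://wiki.squid-cache.org/SquidFaq/SquidAcl"

# Combined patterns as they appear in the two "looking for" log lines.
first_pattern = (
    r"(^https://wiki.squid-cache.org/SquidFaq/SquidAcl.*)"
    r"|(squid-cache.org/SquidFaq/SquidAcl.*)"
)
second_pattern = (
    r"(^https?://[^/]+/wiki.squid-cache.org/SquidFaq/SquidAcl.*)"
    r"|(squid-cache.org.*)"
)


def matches(pattern, text):
    """Return True if the pattern is found anywhere in text."""
    return re.search(pattern, text) is not None


for name, pattern in (("first", first_pattern), ("second", second_pattern)):
    for text in (connect_target, full_url):
        print(name, repr(text), matches(pattern, text))
```

Only the first pattern against 'wiki.squid-cache.org:443' fails to match,
which is consistent with "result for 'whitelist' is 0" in the first excerpt
and "is 1" in the second. The regexes themselves behave as written; what
differs from the expectation is the string they are applied to.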


Any idea why url_regex doesn't match against the full URL and instead only
matches against the host name and port?

The Squid FAQ <https://wiki.squid-cache.org/SquidFaq/SquidAcl> says the
following:

*url_regex*: URL regular expression pattern matching
*urlpath_regex*: URL-path regular expression pattern matching, leaves out
the protocol and hostname


with this example given

acl special_url url_regex ^http://www.squid-cache.org/Doc/FAQ/$
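
To illustrate the distinction the FAQ draws, here is a rough sketch in
Python (not Squid code) of which part of a plain-HTTP request URL each ACL
type would be compared against. The helper names and path_pattern are
hypothetical; only faq_pattern comes from the FAQ example above, and the
path extraction follows the FAQ's wording ("leaves out the protocol and
hostname") rather than Squid's source.

```python
import re
from urllib.parse import urlsplit


def url_regex_subject(url):
    """url_regex: the whole URL is the string being searched."""
    return url


def urlpath_regex_subject(url):
    """urlpath_regex: the URL with protocol and hostname left out."""
    return urlsplit(url).path


url = "http://www.squid-cache.org/Doc/FAQ/"
faq_pattern = r"^http://www.squid-cache.org/Doc/FAQ/$"  # from the FAQ
path_pattern = r"^/Doc/FAQ/$"  # hypothetical urlpath_regex counterpart

whole_url_hit = re.search(faq_pattern, url_regex_subject(url)) is not None
path_only_hit = re.search(faq_pattern, urlpath_regex_subject(url)) is not None
path_pattern_hit = re.search(path_pattern, urlpath_regex_subject(url)) is not None

print(whole_url_hit, path_only_hit, path_pattern_hit)
```

A pattern anchored on "^http://" can only succeed when the full URL is the
subject, so the same pattern fails once the protocol and hostname are left
out.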


This seems to be the case between 3.3.8 (default on Ubuntu 14.04) and
3.5.12 (default on Ubuntu 16.04).

Is there another configuration that forces url_regex to match the entire
URL? Or should I use a different ACL type?
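
One possible direction, sketched in Python rather than squid.conf: if only
"host:port" is visible for an HTTPS request, as the log above suggests, a
whitelist can at most decide on the host name. Squid has a dstdomain ACL
type for matching destination domains; the functions below only imitate
that idea (a leading dot meaning "this domain and its subdomains") and are
not Squid's implementation. All names are hypothetical, and IPv6 literals
are not handled.

```python
def split_host_port(target):
    """Split a CONNECT-style 'host:port' string into (host, port)."""
    host, _, port = target.rpartition(":")
    return host.lower(), int(port)


def domain_allowed(host, allowed_domains):
    """True if host is an allowed domain or, for entries that start
    with a dot, a subdomain of that domain."""
    for entry in allowed_domains:
        if entry.startswith("."):
            if host == entry[1:] or host.endswith(entry):
                return True
        elif host == entry:
            return True
    return False


host, port = split_host_port("wiki.squid-cache.org:443")
print(host, port)
print(domain_allowed(host, [".squid-cache.org"]))
print(domain_allowed("example.com", [".squid-cache.org"]))
```

This gives host-level control only. Restricting by path, as in
/SquidFaq/SquidAcl, would require the proxy to see the decrypted request,
which is what the ssl_bump remark quoted below refers to.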

Best,

On Mon, Oct 15, 2018 at 11:11 AM RB <ronthecon at gmail.com> wrote:

> Hi Matus,
>
> Thanks for responding so quickly. I uploaded my configurations here if
> that is more helpful: https://bit.ly/2NF4zNb
>
> The config that I previously shared is called squid_corp.conf. I also
> noticed that if I don't use regular expressions and instead use domains, it
> works correctly:
>
> # acl whitelist url_regex "/vagrant/squid_sites.txt"
> acl whitelist url_regex .squid-cache.org
>
>
> Every time my squid.conf or my squid_sites.txt is modified, I restart the
> squid service
>
> sudo service squid3 restart
>
>
> Then I use curl to test and now the url works.
>
> $ curl -sSL --proxy localhost:3128 -D - https://wiki.squid-cache.org/SquidFaq/SquidAcl -o /dev/null 2>&1
> HTTP/1.1 200 Connection established
>
> HTTP/1.1 200 OK
> Date: Mon, 15 Oct 2018 14:47:33 GMT
> Server: Apache/2.4.7 (Ubuntu)
> Vary: Cookie,User-Agent,Accept-Encoding
> Content-Length: 101912
> Cache-Control: max-age=3600
> Expires: Mon, 15 Oct 2018 15:47:33 GMT
> Content-Type: text/html; charset=utf-8
>
>
> But this does not let me get more granular. I can allow all subdomains
> and paths of the domain squid-cache.org, but I cannot restrict access to
> specific regular expressions, whether I put them inline or in
> squid_sites.txt.
>
> # acl whitelist url_regex "/vagrant/squid_sites.txt"
> acl whitelist url_regex ^https://wiki.squid-cache.org/SquidFaq/SquidAcl.*
> acl whitelist url_regex .*squid-cache.org/SquidFaq/SquidAcl.*
>
>
> If I put them inline as above, Squid reports the following when I
> restart it:
>
> 2018/10/15 14:54:48 kid1| strtokFile: .*squid-cache.org/SquidFaq/SquidAcl.* not found
>
>
> If I put the expressions in the squid_sites.txt the above "not found"
> message isn't shown and this is the debug output in
> /var/log/squid3/cache.log (full output https://pastebin.com/NVwRxVmQ).
>
> 2018/10/15 15:05:45.083 kid1| Checklist.cc(275) matchNode: 0x7fb0068da2b8 matched=1 async=0 finished=0
> 2018/10/15 15:05:45.083 kid1| Acl.cc(336) matches: ACLList::matches: checking whitelist
> 2018/10/15 15:05:45.083 kid1| Acl.cc(319) checklistMatches: ACL::checklistMatches: checking 'whitelist'
> 2018/10/15 15:05:45.083 kid1| RegexData.cc(71) match: aclRegexData::match: checking 'wiki.squid-cache.org:443'
> 2018/10/15 15:05:45.084 kid1| RegexData.cc(82) match: aclRegexData::match: looking for '(^https://wiki.squid-cache.org/SquidFaq/SquidAcl.*)|(squid-cache.org/SquidFaq/SquidAcl.*)'
> 2018/10/15 15:05:45.084 kid1| Acl.cc(321) checklistMatches: ACL::ChecklistMatches: result for 'whitelist' is 0
> 2018/10/15 15:05:45.084 kid1| Acl.cc(349) matches: whitelist mismatched.
> 2018/10/15 15:05:45.084 kid1| Acl.cc(354) matches: whitelist result is false
>
>
> So it's failing the regular expression check. If I use grep to verify if
> the regex works, it does.
>
> $ echo https://wiki.squid-cache.org/SquidFaq/SquidAcl | grep "^https://wiki.squid-cache.org/SquidFaq/SquidAcl.*"
> https://wiki.squid-cache.org/SquidFaq/SquidAcl
>
>
> > are you aware that you can only see CONNECT in https requests, unless
> > using ssl_bump?
>
> Ah, interesting. Are you saying that my https connections will always fail
> unless I use ssl_bump to decrypt https to http connections? How would this
> work correctly in production? Does squid proxy only block urls if it
> detects http? How do you configure ssl_bump to work in this case? And is
> that viable in production?
>
> > of course it matches all, everything should match "all".
> > I more wonder why doesn't it match "http_access allow localhost"
>
> > have you reloaded squid config after changing it?
> > Did squid confirm it?
>
> Would you have an example of one entire config file that would work to
> whitelist an http/https url using a regular expression?
>
> Best,
>
>
> On Mon, Oct 15, 2018 at 4:49 AM Matus UHLAR - fantomas <uhlar at fantomas.sk>
> wrote:
>
>> On 15.10.18 01:04, RB wrote:
>> >I'm trying to deny all urls except for only whitelisted regular
>> >expressions. I have only this regular expression in my file
>> >"squid_sites.txt"
>> >
>> >^https://wiki.squid-cache.org/SquidFaq/SquidAcl.*
>>
>> are you aware that you can only see CONNECT in https requests, unless
>> using ssl_bump?
>>
>>
>> >acl bastion src 10.5.0.0/1
>> >acl whitelist url_regex "/vagrant/squid_sites.txt"
>> [...]
>> >http_access allow manager localhost
>> >http_access deny manager
>> >http_access deny !Safe_ports
>> >http_access allow localhost
>> >http_access allow purge localhost
>> >http_access deny purge
>> >http_access deny CONNECT !SSL_ports
>> >
>> >http_access allow bastion whitelist
>> >http_access deny bastion all
>>
>> >I tried enabling debugging and tailing /var/log/squid3/cache.log but my
>> >curl statement keeps matching "all".
>>
>> of course it matches all, everything should match "all".
>>
>> I more wonder why doesn't it match "http_access allow localhost"
>>
>> >$ curl -sSL --proxy localhost:3128 -D - "https://wiki.squid-cache.org/SquidFaq/SquidAcl" -o /dev/null 2>&1 | grep Squid
>> >X-Squid-Error: ERR_ACCESS_DENIED 0
>>
>> >Any ideas what I'm doing wrong?
>>
>> have you reloaded squid config after changing it?
>> Did squid confirm it?
>>
>> --
>> Matus UHLAR - fantomas, uhlar at fantomas.sk ; http://www.fantomas.sk/
>> Warning: I wish NOT to receive e-mail advertising to this address.
>> Varovanie: na tuto adresu chcem NEDOSTAVAT akukolvek reklamnu postu.
>> It's now safe to throw off your computer.