[squid-users] How to create a simple whitelist using regexes?

RB ronthecon at gmail.com
Mon Oct 15 16:48:53 UTC 2018


Hi again...

After some more research, it looks like for HTTPS requests squid only sees
the domain (from the CONNECT request), and the only way to get the URL path
and query string is to use ssl_bump to decrypt the HTTPS traffic so squid
can see the path and query arguments.
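
A quick sanity check with grep illustrates why my anchored regexes never
match for HTTPS: the ACL appears to be tested against the CONNECT target
("host:port"), not the full URL. (This is just an illustration using grep
and the hostname and patterns from my tests below, not squid itself.)

```shell
# For an HTTPS request, the url_regex ACL appears to be tested against
# the CONNECT target ("host:port"), never the full https:// URL.
target='wiki.squid-cache.org:443'

# A regex anchored to the full URL can never match the CONNECT form:
echo "$target" | grep -qE '^https://wiki\.squid-cache\.org/SquidFaq/SquidAcl' \
  && echo "full-URL regex: match" || echo "full-URL regex: no match"

# ...while a plain host pattern matches it fine:
echo "$target" | grep -qE 'squid-cache\.org:443' \
  && echo "host regex: match" || echo "host regex: no match"
```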

To use ssl_bump, I have to build squid from source with --enable-ssl,
create a certificate, and add it to the trusted certificate chain on every
other VM that proxies through squid. Squid can then decrypt the HTTPS URLs,
see their paths and query arguments, and finally apply the regexes to those
full URLs so that only explicitly whitelisted URLs are allowed.

Is this correct?
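
If so, I'd guess the squid.conf side ends up looking roughly like this
sketch (the port, paths, sslcrtd_program location, and the peek/bump steps
are my assumptions from reading about ssl_bump in 3.5, not a tested
config):

```
# Assumes squid 3.5 built with SSL support; all paths are examples.
http_port 3128 ssl-bump \
    cert=/etc/squid/myCA.pem \
    generate-host-certificates=on \
    dynamic_cert_mem_cache_size=4MB

# Peek at the TLS client hello first, then bump (decrypt) the rest,
# so url_regex ACLs can see the full https:// URL:
acl step1 at_step SslBump1
ssl_bump peek step1
ssl_bump bump all

sslcrtd_program /usr/lib/squid/ssl_crtd -s /var/lib/ssl_db -M 4MB
```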

On Mon, Oct 15, 2018 at 11:56 AM RB <ronthecon at gmail.com> wrote:

> I think I know what the issue is, which may give us a clue to what is
> going on.
>
> 2018/10/15 15:05:45.083 kid1| RegexData.cc(71) match: aclRegexData::match:
> checking 'wiki.squid-cache.org:443'
> 2018/10/15 15:05:45.084 kid1| RegexData.cc(82) match: aclRegexData::match:
> looking for '(^https://wiki.squid-cache.org/SquidFaq/SquidAcl.*)|(squid-cache.org/SquidFaq/SquidAcl.*)'
> 2018/10/15 15:05:45.084 kid1| Acl.cc(321) checklistMatches:
> ACL::ChecklistMatches: result for 'whitelist' is 0
>
> The above seems to be applying the regex to "wiki.squid-cache.org:443"
> instead of to "https://wiki.squid-cache.org/SquidFaq/SquidAcl". I added
> the regex ".*squid-cache.org.*" to my list of regular expressions and now I
> see this.
>
> 2018/10/15 15:16:03.641 kid1| RegexData.cc(71) match: aclRegexData::match:
> checking 'wiki.squid-cache.org:443'
> 2018/10/15 15:16:03.641 kid1| RegexData.cc(82) match: aclRegexData::match:
> looking for '(^https?://[^/]+/wiki.squid-cache.org/SquidFaq/SquidAcl.*)|(squid-cache.org.*)'
> 2018/10/15 15:16:03.641 kid1| RegexData.cc(93) match: aclRegexData::match:
> match '(^https?://[^/]+/wiki.squid-cache.org/SquidFaq/SquidAcl.*)|(squid-cache.org.*)'
> found in 'wiki.squid-cache.org:443'
> 2018/10/15 15:16:03.641 kid1| Acl.cc(321) checklistMatches:
> ACL::ChecklistMatches: result for 'whitelist' is 1
>
>
> Any idea why url_regex doesn't try to match the full URL and instead only
> matches the host (subdomain plus domain) and port?
>
> The Squid FAQ <https://wiki.squid-cache.org/SquidFaq/SquidAcl> says the
> following:
>
> *url_regex*: URL regular expression pattern matching
> *urlpath_regex*: URL-path regular expression pattern matching, leaves out
> the protocol and hostname
>
>
> with this example given
>
> acl special_url url_regex ^http://www.squid-cache.org/Doc/FAQ/$
>
>
> This seems to be the case on both 3.3.8 (the default on Ubuntu 14.04) and
> 3.5.12 (the default on Ubuntu 16.04).
>
> Is there another configuration that forces url_regex to match the entire
> URL, or should I use a different acl type?
>
> Best,
>
> On Mon, Oct 15, 2018 at 11:11 AM RB <ronthecon at gmail.com> wrote:
>
>> Hi Matus,
>>
>> Thanks for responding so quickly. I uploaded my configurations here if
>> that is more helpful: https://bit.ly/2NF4zNb
>>
>> The config that I previously shared is called squid_corp.conf. I also
>> noticed that if I don't use regular expressions and instead use domains, it
>> works correctly:
>>
>> # acl whitelist url_regex "/vagrant/squid_sites.txt"
>> acl whitelist url_regex .squid-cache.org
>>
>>
>> Every time my squid.conf or my squid_sites.txt is modified, I restart the
>> squid service
>>
>> sudo service squid3 restart
>>
>>
>> Then I use curl to test and now the url works.
>>
>> $ curl -sSL --proxy localhost:3128 -D -
>> https://wiki.squid-cache.org/SquidFaq/SquidAcl -o /dev/null 2>&1
>> HTTP/1.1 200 Connection established
>>
>> HTTP/1.1 200 OK
>> Date: Mon, 15 Oct 2018 14:47:33 GMT
>> Server: Apache/2.4.7 (Ubuntu)
>> Vary: Cookie,User-Agent,Accept-Encoding
>> Content-Length: 101912
>> Cache-Control: max-age=3600
>> Expires: Mon, 15 Oct 2018 15:47:33 GMT
>> Content-Type: text/html; charset=utf-8
>>
>>
>> But this does not let me get more granular. I can allow all subdomains
>> and paths of squid-cache.org, but I'm unable to allow only the specific
>> regular expressions, whether I put them inline or in squid_sites.txt.
>>
>> # acl whitelist url_regex "/vagrant/squid_sites.txt"
>> acl whitelist url_regex ^https://wiki.squid-cache.org/SquidFaq/SquidAcl.*
>> acl whitelist url_regex .*squid-cache.org/SquidFaq/SquidAcl.*
>>
>>
>> If I put them inline as above, then when I restart squid it says the
>> following:
>>
>> 2018/10/15 14:54:48 kid1| strtokFile: .*squid-cache.org/SquidFaq/SquidAcl.* not found
>>
>>
>> If I put the expressions in squid_sites.txt, the "not found" message
>> isn't shown, and this is the debug output in /var/log/squid3/cache.log
>> (full output: https://pastebin.com/NVwRxVmQ).
>>
>> 2018/10/15 15:05:45.083 kid1| Checklist.cc(275) matchNode: 0x7fb0068da2b8
>> matched=1 async=0 finished=0
>> 2018/10/15 15:05:45.083 kid1| Acl.cc(336) matches: ACLList::matches:
>> checking whitelist
>> 2018/10/15 15:05:45.083 kid1| Acl.cc(319) checklistMatches:
>> ACL::checklistMatches: checking 'whitelist'
>> 2018/10/15 15:05:45.083 kid1| RegexData.cc(71) match:
>> aclRegexData::match: checking 'wiki.squid-cache.org:443'
>> 2018/10/15 15:05:45.084 kid1| RegexData.cc(82) match:
>> aclRegexData::match: looking for '(^https://wiki.squid-cache.org/SquidFaq/SquidAcl.*)|(squid-cache.org/SquidFaq/SquidAcl.*)'
>> 2018/10/15 15:05:45.084 kid1| Acl.cc(321) checklistMatches:
>> ACL::ChecklistMatches: result for 'whitelist' is 0
>> 2018/10/15 15:05:45.084 kid1| Acl.cc(349) matches: whitelist mismatched.
>> 2018/10/15 15:05:45.084 kid1| Acl.cc(354) matches: whitelist result is
>> false
>>
>>
>> So it's failing the regular expression check. But if I verify the regex
>> with grep, it does match:
>>
>> $ echo https://wiki.squid-cache.org/SquidFaq/SquidAcl | grep "^https://wiki.squid-cache.org/SquidFaq/SquidAcl.*"
>> https://wiki.squid-cache.org/SquidFaq/SquidAcl
>>
>>
>> > are you aware that you can only see CONNECT in https requests, unless
>> using
>> ssl_bump?
>>
>> Ah, interesting. Are you saying that my HTTPS connections will always
>> fail unless I use ssl_bump to decrypt them? How would this work in
>> production? Does the squid proxy only block URLs when it sees plain
>> HTTP? How do you configure ssl_bump for this case, and is that viable in
>> production?
>>
>> > of course it matches all, everything should match "all".
>> > I more wonder why doesn't it match "http_access allow localhost"
>>
>> > have you reloaded squid config after changing it?
>> > Did squid confirm it?
>>
>> Would you have an example of one entire config file that would work to
>> whitelist an http/https url using a regular expression?
>>
>> Best,
>>
>>
>> On Mon, Oct 15, 2018 at 4:49 AM Matus UHLAR - fantomas <uhlar at fantomas.sk>
>> wrote:
>>
>>> On 15.10.18 01:04, RB wrote:
>>> >I'm trying to deny all urls except for only whitelisted regular
>>> >expressions. I have only this regular expression in my file
>>> >"squid_sites.txt"
>>> >
>>> >^https://wiki.squid-cache.org/SquidFaq/SquidAcl.*
>>>
>>> are you aware that you can only see CONNECT in https requests, unless
>>> using
>>> ssl_bump?
>>>
>>>
>>> >acl bastion src 10.5.0.0/1
>>> >acl whitelist url_regex "/vagrant/squid_sites.txt"
>>> [...]
>>> >http_access allow manager localhost
>>> >http_access deny manager
>>> >http_access deny !Safe_ports
>>> >http_access allow localhost
>>> >http_access allow purge localhost
>>> >http_access deny purge
>>> >http_access deny CONNECT !SSL_ports
>>> >
>>> >http_access allow bastion whitelist
>>> >http_access deny bastion all
>>>
>>> >I tried enabling debugging and tailing /var/log/squid3/cache.log but my
>>> >curl statement keeps matching "all".
>>>
>>> of course it matches all, everything should match "all".
>>>
>>> I wonder more why it doesn't match "http_access allow localhost"
>>>
>>> >$ curl -sSL --proxy localhost:3128 -D - "
>>> >https://wiki.squid-cache.org/SquidFaq/SquidAcl" -o /dev/null 2>&1 |
>>> grep
>>> >Squid
>>> >X-Squid-Error: ERR_ACCESS_DENIED 0
>>>
>>> >Any ideas what I'm doing wrong?
>>>
>>> have you reloaded squid config after changing it?
>>> Did squid confirm it?
>>>
>>> --
>>> Matus UHLAR - fantomas, uhlar at fantomas.sk ; http://www.fantomas.sk/
>>> Warning: I wish NOT to receive e-mail advertising to this address.
>>> Varovanie: na tuto adresu chcem NEDOSTAVAT akukolvek reklamnu postu.
>>> It's now safe to throw off your computer.
>>> _______________________________________________
>>> squid-users mailing list
>>> squid-users at lists.squid-cache.org
>>> http://lists.squid-cache.org/listinfo/squid-users
>>>
>>