[squid-users] Allow some domains to bypass Squid

Amos Jeffries squid3 at treenet.co.nz
Sun Mar 11 10:17:54 UTC 2018


On 11/03/18 21:33, Nicolas Kovacs wrote:
> On 11/03/2018 at 09:24, Amos Jeffries wrote:
>> What you need to start with is to switch your thinking from "domains"
>> to considering things in terms of connections and individual servers.
>> Since "domain" is a URL concept, and URLs are all hidden inside the
>> encrypted part of the traffic, there is no knowing what that really is
>> until after decryption.
>>
>> However, when dealing with servers and connections, the connection's
>> TLS SNI can tell you which *server* a client is connecting to, and you
>> can decide to do the splice action based on which servers you are
>> having trouble with (not domains).
>>
>> Or better yet, decide even earlier in your NAT system not to send that
>> traffic to the proxy at all.
> 
> I'm sorry, but I don't understand what you're saying.
> 

Once the traffic arrives at the proxy it MUST be handled. It is too late
to send it elsewhere.

So to actually bypass the proxy you have to not send the TCP packets to
it at all. But the NAT system only works with raw-IPs.


There are many ways to "handle" traffic though. Rejection is one. When
TLS is involved, relaying without doing anything (aka splice) is another.

But TLS is a point-to-point security protocol. Its handshakes deal with
origin server names. The domain name is a secondary detail that is only
sometimes available.

See
<https://superuser.com/questions/59093/difference-between-host-name-and-domain-name/59094>



> Here's what I want. It's very simple.
> 
> Create a text file that contains a list of domains. For example:
> 
>   google.com

Taking this as an example:

"google.com" is the public domain that users enter into their browsers
to view a certain website. The browser then does a lot of stuff, and
eventually contacts one of the origin servers for "google.com".

But "google.com" is not actually the name of any of those origin
servers. The server names for Google machines are all inside the
*.1e100.net domain, and all their machines answer to many different
"domain names" at the HTTP level (gmail.com, google.com, youtube.com,
googlevideo.com, 1e100.net, ... and many others, including all the
country-specific ccTLD variations on those names, and a lot of common
typos, e.g. "gogle.com").


Your MITM proxy does not receive a URL straight off. It receives the
details of a TCP SYN packet, which contain only the raw-IP for that
Google server. It can look up the DNS to find out the server's name
 ... something.1e100.net.

If you configured it to handle the TLS handshake (with ssl-bump) it will
receive various representations of that server name in TLS messages.
These should still be something.1e100.net, usually not "google.com" -
but that depends on whether the client software (Browser or non-Browser)
is properly obeying the requirement that it indicate the *server name*
in TLS SNI. It also depends on whether the Google servers list the
"google.com" domain as an alias (SubjectAltName) in the TLS certificate
of any particular server (some do, some do not).


All of the above has to happen and be acceptable to the proxy access
controls you have configured before it gets a chance to decrypt the HTTP
message inside the TLS encryption ... and finally find out what the URL
for that message is, along with its domain name.
Only then can it re-process those access controls for the HTTP(S)
message itself using the actual domain name the client wants.

The above is why we have different "dstdomain" and "ssl::server_name"
ACL types to deal with the different name information available.
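
A minimal sketch of how a splice rule based on the TLS server name could
look in squid.conf. It assumes a Squid build with peek-and-splice
support and an http_port/https_port already set up for ssl-bump (not
shown). The ACL name "no_bump_servers" and the file path are
hypothetical, picked only for illustration:

```
# Hypothetical file, one server name per line. A leading dot also
# matches sub-names, e.g. ".credit-cooperatif.fr"
acl no_bump_servers ssl::server_name "/etc/squid/no-bump-servers.txt"

acl step1 at_step SslBump1

# Step 1: peek at the TLS ClientHello so the SNI becomes available.
ssl_bump peek step1

# Relay connections to the listed servers without decrypting them.
ssl_bump splice no_bump_servers

# Decrypt everything else.
ssl_bump bump all
```

Note that the names in that file are matched against what the TLS
handshake shows (SNI, server certificate names), not against the domain
in the URL. A "dstdomain" ACL on the same names would only be useful
after decryption, which never happens for spliced connections.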



>   hotmail.com
>   github.com
>   credit-cooperatif.fr
> 
> And then all connections that go to any one of these domains don't get
> cached, but simply pass through Squid.

The process never gets anywhere near the point where caching becomes
relevant. The error you mentioned earlier is in the TLS handshake part
of the process.


HTH
Amos

