[squid-users] IPcache and mixed case domain names

Binoy Fernandez binoyaf at yahoo.com
Wed Dec 15 16:56:08 UTC 2021


Hi everyone,

I hope all is well. I have a question on the behaviour of IPcache with mixed case domain names that I hope you can help clarify.

Environment: Squid version 4.15 as a forward proxy, using sslbump peek (step 1) and splice (step 2) and TProxy.

I noticed a low cache hit ratio in the IPcache module in an environment where a large proportion (90%) of requests are to mixed case domain names. On reviewing the code, it seems the lookup is done using the domain name as-is (mixed case), whereas the entries in IPcache are stored in lower case. From a brief review of the code and logs, this is what I think is happening:

- curl a mixed case domain (e.g. 2D3755028EC4F578005E3EDC9A7E34F3.gr7.us-east-2.eks.amazonaws.com)
 -- On this first request, Squid tries to look up the domain name (ipcache_get) after it gets the Client Hello from the client and before it initiates a TCP connection to the upstream. This results in a cache miss (expected). The domain name here comes from the Client Hello SNI.
 -- It then creates an instance of ipcache_entry and requests the internal DNS module to resolve the domain name. ipcache_entry's constructor lower cases the domain name and stores it in the hash.key field.
 -- Once the DNS response comes in, the callback function (ipcacheHandleReply) is invoked. This adds an entry (ipcacheAddEntry) for this DNS lookup to IPcache. The key here is the domain name in lower case.

- curl the same domain a second time (before the TTL expires)
 -- Squid tries to look up the domain name (ipcache_get) as before, but this again results in a cache miss (not expected). I believe the argument to ipcache_get is in mixed case at this point, hence the cache miss?

The access log entries for the connections from Squid to upstream are as follows:
Request 1 -
[15/Dec/2021:16:12:12]   2409 11 10.1.3.204:49852 3129 TCP_TUNNEL/200 2978 CONNECT 2D3755028EC4F578005E3EDC9A7E34F3.gr7.us-east-2.eks.amazonaws.com:443 - HIER_DIRECT/2D3755028EC4F578005E3EDC9A7E34F3.gr7.us-east-2.eks.amazonaws.com 3.17.195.231

Request 2 -
[15/Dec/2021:16:12:51]   2394 2 10.1.3.204:58366 3129 TCP_TUNNEL/200 3000 CONNECT 2D3755028EC4F578005E3EDC9A7E34F3.gr7.us-east-2.eks.amazonaws.com:443 - HIER_DIRECT/2D3755028EC4F578005E3EDC9A7E34F3.gr7.us-east-2.eks.amazonaws.com 3.143.136.45

Log format - [%tg] %6tr %dt %>a:%>p %lp %Ss/%03Hs %<st %rm %ru %ue %Sh/%<A %<a

If I repeat the same test case but with the domain name in lower case (2d3755028ec4f578005e3edc9a7e34f3.gr7.us-east-2.eks.amazonaws.com), then the second request has a "-" for DNS resolution time (%dt) in the access log, indicating a cache hit. (The same is also reflected in the cache logs.)

Assuming IPcache always stores domain names in lower case, I think a change might be needed in the ipcache_get function to lower-case the key it passes as the second argument to hash_lookup.
- github.com/squid-cache/squid/blob/master/src/ipcache.cc#L325 
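
To illustrate the suspected mismatch, here is a minimal standalone sketch. It is not Squid code: ToyIpCache, toLowerAscii, getAsIs and getNormalized are hypothetical names, and std::unordered_map merely stands in for Squid's hash table. It assumes C++17 and only models the behaviour described above, where keys are stored in lower case but may be looked up as-is.

```cpp
#include <algorithm>
#include <cctype>
#include <optional>
#include <string>
#include <unordered_map>

// Hypothetical helper: ASCII-lowercase a host name.
std::string toLowerAscii(std::string name) {
    std::transform(name.begin(), name.end(), name.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return name;
}

// Toy stand-in for the IP cache (NOT Squid's ipcache_entry or hash table).
class ToyIpCache {
public:
    // Mirrors the behaviour described above: keys are stored lower-cased.
    void add(const std::string &name, const std::string &addr) {
        table_[toLowerAscii(name)] = addr;
    }

    // Lookup using the name as-is: a mixed case name never matches
    // the lower-cased key, so this returns an empty optional (a miss).
    std::optional<std::string> getAsIs(const std::string &name) const {
        return find(name);
    }

    // Suggested behaviour: normalise the name before the lookup.
    std::optional<std::string> getNormalized(const std::string &name) const {
        return find(toLowerAscii(name));
    }

private:
    std::optional<std::string> find(const std::string &key) const {
        const auto it = table_.find(key);
        if (it == table_.end())
            return std::nullopt;
        return it->second;
    }

    std::unordered_map<std::string, std::string> table_;
};
```

In this model, after add("2D37.Example.COM", ...), getAsIs("2D37.Example.COM") misses while getNormalized("2D37.Example.COM") hits, which matches what I see in the access log. Whether the real fix belongs inside ipcache_get or in its callers is something I cannot judge.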

Apologies if this is way off the mark; I am not familiar with the Squid code base. Please advise if this observation is incorrect, and I can then shift focus to other areas to troubleshoot.

Thank you, 
Binoy Fernandez

