[squid-users] Linearly increasing delays in HTTPS proxy CONNECTS / 3.5.20
ilari.laitinen at iki.fi
Tue Aug 20 13:12:12 UTC 2019
I am experiencing a slowdown between CONNECT requests and corresponding "200 Connection established" responses. This happens when performance testing a load-balanced pair of squid servers with hundreds of simultaneous, independent, proxied HTTPS requests per second per server.
Most new connections get delayed by an increasing amount of time, often rising to more than 5 seconds. This obviously fails the performance test. Existing connections seem to be unaffected. The problematic situation lasts for a couple of seconds, with strikingly linearly increasing delays. Expected operation resumes afterwards with a notable spike in traffic. Sometimes the delay stays at a specific level for a couple of seconds before the issue resolves itself.
squid 3.5.20 on CentOS 7
I have recorded the traffic using tcpdump and it boils down to the following. Everything else is very, very fast even during these slowdown periods.
“source” is one of the servers where the load test originates
“squid” is the server where squid runs
“target” is a cloud service
source —> squid [SYN]
source <— squid [SYN, ACK]
source —> squid [ACK]
source —> squid [CONNECT target:443 HTTP/1.1]
source <— squid [ACK]
[Unexpected delay here]
squid —> target [SYN]
squid <— target [SYN, ACK]
squid —> target [ACK]
source <— squid [HTTP/1.1 200 Connection established]
source —> squid [ACK]
[Rest of the connection here without such delays]
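To put numbers on the gap marked above, the capture can be reduced to one delay per connection and checked for the linear growth described earlier. The following is only a minimal sketch: it assumes the tcpdump output has already been parsed into (timestamp, connection id, event) tuples, and the event names "connect" and "established" as well as both function names are hypothetical, not anything squid or tcpdump produces.

```python
from collections import defaultdict


def connect_delays(events):
    """Return {conn_id: seconds between CONNECT and the 200 response}.

    `events` is an iterable of (timestamp_seconds, conn_id, kind) tuples,
    where kind is "connect" or "established" (hypothetical labels).
    """
    seen = defaultdict(dict)
    for ts, conn_id, kind in events:
        # Keep only the first timestamp of each kind per connection.
        seen[conn_id].setdefault(kind, ts)
    return {
        conn_id: kinds["established"] - kinds["connect"]
        for conn_id, kinds in seen.items()
        if "connect" in kinds and "established" in kinds
    }


def slope(points):
    """Least-squares slope of y over x for a list of (x, y) pairs."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    var_x = sum((x - mean_x) ** 2 for x, _ in points)
    if var_x == 0:
        raise ValueError("need at least two distinct x values")
    cov = sum((x - mean_x) * (y - mean_y) for x, y in points)
    return cov / var_x


# Made-up sample: three connections whose delay grows with arrival time.
events = [
    (0.0, "a", "connect"), (0.5, "a", "established"),
    (1.0, "b", "connect"), (2.5, "b", "established"),
    (2.0, "c", "connect"), (4.5, "c", "established"),
]
delays = connect_delays(events)
starts = {cid: ts for ts, cid, kind in events if kind == "connect"}
trend = slope([(starts[cid], delays[cid]) for cid in delays])
```

Plotting or fitting delay against CONNECT arrival time this way would show whether the growth really is linear and how steep it is during a slowdown period.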
System load stays at a comfortable level (around 0.6 even while experiencing the issue), and memory is not an issue. From SNMP data, I noticed small but consistent spikes in squid's disk cache usage coinciding with the issue at hand. This seemed strange, given there was no other traffic during the tests and proxied HTTPS means there's nothing to cache (right?). I nevertheless tried switching the cache from ufs to aufs and also using the no-cache option with ufs. Didn’t help. (And the spikes remained…)
I tried increasing the net.ipv4.ip_local_port_range sysctl value (it was set to default). That didn’t help, either.
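The reasoning behind trying the port range can be written as back-of-the-envelope arithmetic. This is a rough sketch under loudly stated assumptions: every outgoing connection goes to a single target IP and port, the squid side ends up holding each closed connection in TIME_WAIT, and TIME_WAIT lasts 60 seconds on Linux. The port ranges in the example are illustrative values, not measurements from these servers, and the function name is hypothetical.

```python
def max_sustained_rate(port_low, port_high, hold_seconds):
    """Upper bound on new outgoing connections per second to one
    destination IP:port when each local port stays unusable for
    `hold_seconds` after its connection closes."""
    ports = port_high - port_low + 1
    return ports / hold_seconds


# Illustrative ranges only (assumed, not read from the actual servers).
narrow = max_sustained_rate(32768, 60999, 60)  # 28232 ports
wide = max_sustained_rate(1024, 65535, 60)     # 64512 ports
```

Under these assumptions the narrower range caps out at a few hundred new connections per second to one destination, which is the same order of magnitude as the test load, so the port range was a plausible suspect even though widening it made no difference here.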
The servers are located in an IPv4-only local network. Every outgoing request is supposed to be IPv4. The servers do have IPv6 interfaces, but there is no IPv6 traffic at all. Squid periodically queries AAAA records. Is it possible that new connections get queued while squid is busy trying to use IPv6 after receiving the new AAAAs? I have very little control over the environment. Is dns_v4_first worth a try in my scenario?
Is there something I’m missing?
What should I look into next? Could setting up "workers N" help, for example?