[squid-users] High response times with Squid

Ahmad, Sarfaraz Sarfaraz.Ahmad at deshaw.com
Fri Feb 8 06:30:09 UTC 2019


Hi,

I am using Squid 4.5 with WCCP. Intercepting SSL by peeking at step1 and then deciding to either splice or bump upon the SNI.
I am noticing a weird behavior for some of my TCP connections.  Squid is taking over 20s to decide what do with the ClientHello sent by the browser. It is only after 20s that it decides to send out a ClientHello to the origin server and at the same time reply to the client with a ServerHello.
This behavior is hard to reproduce and only some clients are affected.

I will try to summarize what I see in cache.log with ALL, 6 debug options.


1)      Squid's INTERCEPTION thread/program receives a TCP SYN from workstation.
2019/02/06 17:23:19.070 kid1| 89,5| Intercept.cc(405) Lookup: address BEGIN: me/client= <SQUID_IP>:23129, destination/me= <CLIENT/BROWSER_IP>:58232


2)      Squid becomes the origin server and sets up the TCP connection.
2019/02/06 17:23:19.070 kid1| 5,5| AsyncCall.cc(93) ScheduleCall: TcpAcceptor.cc(339) will call httpsAccept(local=<ORIGIN_SERVER_ON_THE_INTERNET>:443 remote=<CLIENT/BROWSER_IP>:58232 FD 40 flags=33, MXID_1101703) [call34733258]
2019/02/06 17:23:19.070 kid1| 5,5| AsyncCall.cc(38) make: make call httpsAccept [call34733258]
       2019/02/06 17:23:19.070 kid1| 33,4| client_side.cc(2776) httpsAccept: local=<ORIGIN_SERVER_ON_THE_INTERNET>:443 remote=<CLIENT/BROWSER_IP>:58232 FD 40 flags=33 accepted, starting SSL negotiation.


3)      Squid checks the SSL ACLs for the destination IP.
2019/02/06 17:23:19.071 kid1| 28,5| Acl.cc(124) matches: checking (ssl_bump rules)
2019/02/06 17:23:19.071 kid1| 28,5| Checklist.cc(397) bannedAction: Action 'ALLOWED/6' is not banned
2019/02/06 17:23:19.071 kid1| 28,5| Acl.cc(124) matches: checking (ssl_bump rule)
2019/02/06 17:23:19.071 kid1| 28,5| Acl.cc(124) matches: checking no_ssl_bump_src_ip
2019/02/06 17:23:19.071 kid1| 28,3| Ip.cc(538) match: aclIpMatchIp: '<CLIENT/BROWSER_IP>:58232' NOT found
2019/02/06 17:23:19.071 kid1| 28,3| Acl.cc(151) matches: checked: no_ssl_bump_src_ip = 0


4)      Squid decides to allow connections to the remote IP i.e <ORIGIN_SERVER_ON_THE_INTERNET> and decides to peek at the SNI (will accept ClientHello), hence fakes a CONNECT request
2019/02/06 17:23:19.071 kid1| 28,3| Checklist.cc(163) checkCallback: ACLChecklist::checkCallback: 0x1fb19fd8 answer=ALLOWED
2019/02/06 17:23:19.071 kid1| 33,2| client_side.cc(2744) httpsSslBumpAccessCheckDone: sslBump action peekneeded for local=<ORIGIN_SERVER_ON_THE_INTERNET>:443 remote=<CLIENT/BROWSER_IP>:58232 FD 40 flags=33
2019/02/06 17:23:19.071 kid1| 33,2| client_side.cc(3395) fakeAConnectRequest: fake a CONNECT request to force connState to tunnel for ssl-bump


5)      The FAKE connect requests again runs through ACLs.  20ms are spent for DNS PTR lookup. A total of 30ms is spent parsing ACLs.
2019/02/06 17:23:19.103 kid1| 85,2| client_side_request.cc(758) clientAccessCheckDone: The request CONNECT <ORIGIN_SERVER_ON_THE_INTERNET> is ALLOWED; last ACL checked: localnet
2019/02/06 17:23:19.103 kid1| 93,4| AccessCheck.cc(145) checkCandidates: NO candidates left
2019/02/06 17:23:19.103 kid1| 93,3| AccessCheck.cc(196) callBack: NULL
2019/02/06 17:23:19.103 kid1| 93,5| AsyncCall.cc(26) AsyncCall: The AsyncCall Adaptation::Initiator::noteAdaptationAclCheckDone constructed, this=0x1552b0b0 [call34733267]


6)      FAKE connect is processed and Squid reads the TCP connection, gets the ClientHello and reads the SNI.
2019/02/06 17:23:19.104 kid1| 33,5| AsyncCall.cc(38) make: make call Server::doClientRead [call34733270]
2019/02/06 17:23:19.104 kid1| 33,5| AsyncJob.cc(123) callStart: Http1::Server status in: [ job2971436]
2019/02/06 17:23:19.104 kid1| 33,5| Server.cc(104) doClientRead: local=<ORIGIN_SERVER_ON_THE_INTERNET>:443 remote=<CLIENT/BROWSER_IP>:58232 FD 40 flags=33
2019/02/06 17:23:19.104 kid1| 5,3| Read.cc(92) ReadNow: local=<ORIGIN_SERVER_ON_THE_INTERNET>:443 remote=<CLIENT/BROWSER_IP>:58232 FD 40 flags=33, size 4096, retval 203, errno 0
2019/02/06 17:23:19.104 kid1| 33,5| AsyncCall.cc(26) AsyncCall: The AsyncCall ConnStateData::requestTimeout constructed, this=0x3d42d40 [call34733271]
2019/02/06 17:23:19.104 kid1| 5,3| comm.cc(559) commSetConnTimeout: local=<ORIGIN_SERVER_ON_THE_INTERNET>:443 remote=<CLIENT/BROWSER_IP>:58232 FD 40 flags=33 timeout 300
2019/02/06 17:23:19.104 kid1| 83,5| Handshake.cc(404) parseExtensions: first unsupported extension: 19018
2019/02/06 17:23:19.104 kid1| 83,3| Handshake.cc(497) parseSniExtension: host_name=<ORIGIN_SERVER_SNI>


7)      This is followed by another round of ACL processing now that we have the SNI.

2019/02/06 17:23:19.106 kid1| 33,3| Pipeline.cc(35) front: Pipeline 0xc2d7b60 front 0x1ae2e730*2
2019/02/06 17:23:19.106 kid1| 33,3| Pipeline.cc(35) front: Pipeline 0xc2d7b60 front 0x1ae2e730*3
2019/02/06 17:23:19.107 kid1| 83,5| Session.cc(103) NewSessionObject: SSL_new session=0x78136a0
2019/02/06 17:23:19.107 kid1| 83,5| bio.cc(616) squid_bio_ctrl: 0xcd0fe80 104(6000, 0x7ffc32a6e6b4)
2019/02/06 17:23:19.107 kid1| 83,5| Session.cc(162) CreateSession: link FD 40 to TLS session=0x78136a0
2019/02/06 17:23:19.107 kid1| 33,5| client_side.cc(2535) httpsCreate: will negotiate TLS on local=<ORIGIN_SERVER_ON_THE_INTERNET>:443 remote=<CLIENT/BROWSER_IP>:58232 FD 40 flags=33
2019/02/06 17:23:19.107 kid1| 5,5| ModEpoll.cc(117) SetSelect: FD 40, type=1, handler=0, client_data=0, timeout=0
2019/02/06 17:23:19.107 kid1| 83,5| client_side.cc(3324) startPeekAndSplice: Peek and splice at step2 done. Start forwarding the request!!!
2019/02/06 17:23:19.107 kid1| 17,3| FwdState.cc(340) Start: '<ORIGIN_SERVER_ON_THE_INTERNET>:443'



8)      No ServerHello has been sent back to the client yet, Squid starts a TCP connection with the origin server
2019/02/06 17:23:19.110 kid1| 5,4| AsyncJob.cc(123) callStart: Comm::ConnOpener status in: [ job2971439]
2019/02/06 17:23:19.110 kid1| 5,5| ConnOpener.cc(350) doConnect: local=0.0.0.0 remote=<ORIGIN_SERVER_ON_THE_INTERNET>:443 flags=1: Comm::OK - connected
2019/02/06 17:23:19.110 kid1| 5,4| ConnOpener.cc(155) cleanFd: local=0.0.0.0 remote=<ORIGIN_SERVER_ON_THE_INTERNET>:443 flags=1 closing temp FD 50


9)      Squid starts a TLS session with the remote/origin server, sends the ClientHello. A total of 0.4 seconds in Squid sending clienthello to origin server. This is probably when Squid decides to send back the ServerHello to the browser.
2019/02/06 17:23:19.110 kid1| 83,5| Session.cc(103) NewSessionObject: SSL_new session=0x14899390
2019/02/06 17:23:19.111 kid1| 83,5| bio.cc(616) squid_bio_ctrl: 0x1492ef80 104(6001, 0x7ffc32a6e884)
2019/02/06 17:23:19.111 kid1| 83,5| Session.cc(162) CreateSession: link FD 50 to TLS session=0x14899390
2019/02/06 17:23:19.111 kid1| 83,5| PeerConnector.cc(123) initialize: local=<SQUID_IP>:44498 remote=<ORIGIN_SERVER_ON_THE_INTERNET>:443 FD 50 flags=1, session=0x14899390

So somewhere between Step 8 and Step 9, Squid is taking over 20s.

What could possibly be keeping it busy ?
I have external ACL helpers but they work just fine. Average service time is 1ms. Squid has not even spawning all helpers that it has been configured to do. (not exhausted the upper limit).
DNS resolution is also good. All CPU/MEM resources look just fine and again this affects only a subset of the traffic.  I don't have the failure logs from when this actually happens.


UPDATE: This problem statement seems local to a few websites. Outside of the proxy, those websites quite quickly as is expected.

Any thoughts on where to look ?  other bits and pieces I could check ?  I have jumbo frames enabled (9000 bytes) but am running the proxies at L2 1500 MTU.

-Ahmad



-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.squid-cache.org/pipermail/squid-users/attachments/20190208/d4a505d7/attachment-0001.html>


More information about the squid-users mailing list