[squid-users] High response times with Squid

Sat Feb 9 04:50:11 UTC 2019

On 8/02/19 7:30 pm, Ahmad, Sarfaraz wrote:
> Hi,
> 
>  
> 
> I am using Squid 4.5 with WCCP. Intercepting SSL by peeking at step1 and
> then deciding to either splice or bump upon the SNI.
> 
> I am noticing a weird behavior for some of my TCP connections.  Squid is
> taking over 20s to decide what do with the ClientHello sent by the
> browser. It is only after 20s that it decides to send out a ClientHello
> to the origin server and at the same time reply to the client with a
> ServerHello.
> 
> This behavior is hard to reproduce and only some clients are affected.
> 
>  
> 
> I will try to summarize what I see in cache.log with ALL, 6 debug options.
> 
>  
> 
> 1)      Squid's INTERCEPTION thread/program receives a TCP SYN from
> workstation.
> 
> 2019/02/06 17:23:19.070 kid1| 89,5| Intercept.cc(405) Lookup: address
> BEGIN: me/client= *<SQUID_IP>:*23129, destination/me=
> *<CLIENT/BROWSER_IP>:*58232
> 

No. This is looking up the original TCP dst-IP:port in the kernel NAT
tables.

>  
> 
> 2)      Squid becomes the origin server and sets up the TCP connection.
> 

No. The local= log values are a simple statement of the TCP packet
values received from the NAT system at (1). Squid is an MITM in this
setup, so the client *thinks* it is talking to the origin.

Being an MITM Squid is designed to operate as transparently as possible,
but at no time has the abilities of the origin server.

> 2019/02/06 17:23:19.070 kid1| 5,5| AsyncCall.cc(93) ScheduleCall:
> TcpAcceptor.cc(339) will call
> httpsAccept(local*=<ORIGIN_SERVER_ON_THE_INTERNET>*:443
> remote=*<CLIENT/BROWSER_IP>:*58232 FD 40 flags=33, MXID_1101703)
> [call34733258]
> 

...
> 
> 8)      No ServerHello has been sent back to the client yet, Squid
> starts a TCP connection with the origin server
> 
> 2019/02/06 17:23:19.110 kid1| 5,4| AsyncJob.cc(123) callStart:
> Comm::ConnOpener status in: [ job2971439]
> 
> 2019/02/06 17:23:19.110 kid1| 5,5| ConnOpener.cc(350) doConnect:
> local=0.0.0.0 remote*=<ORIGIN_SERVER_ON_THE_INTERNET>:*443 flags=1:
> Comm::OK - connected
> 
> 2019/02/06 17:23:19.110 kid1| 5,4| ConnOpener.cc(155) cleanFd:
> local=0.0.0.0 remote=<*ORIGIN_SERVER_ON_THE_INTERNET*>:443 flags=1
> closing temp FD 50
> 
>  
> 
> 9)      Squid starts a TLS session with the remote/origin server, sends
> the ClientHello. A total of 0.4 seconds in Squid sending clienthello to
> origin server. This is probably when Squid decides to send back the
> ServerHello to the browser.

Don't guess. Check.

Either you have step2 / client-first bumping - in which case the Squid
serverHello would have been sent to the client at (7).

Or, you have step3 / server-first bumping - in which case Squid cannot
send a serverHello to the client until it has received the origin's
serverHello. Which still has not yet been received despite your trace
ending here.

...
> 
> 2019/02/06 17:23:19.111 kid1| 83,5| PeerConnector.cc(123) initialize:
> local=*<SQUID_IP>*:44498 remote=*<ORIGIN_SERVER_ON_THE_INTERNET>*:443 FD
> 50 flags=1, session=0x14899390
> 
>  
> 
> So somewhere between Step 8 and Step 9, Squid is taking over 20s.
> 

There is only 1 millisecond between those steps.

The client connection was received at 17:23:19.070, your (9) finished at
17:23:19.111 -> so there is your 0.41 seconds. If there is any 20s gap
for this transaction it is later in the log part you have not shown.

> 
> What could possibly be keeping it busy ?
> 

Other transactions? Nothing?

What is going on at (9) is *preparing* to send a TLS clientHello. At the
point your log stops it still has not actually been written to the network.

There is actually still a good half of the SSl-Bump process to happen:
 - assemble the Squid clientHello bytes,
 - send that to origin
 - receive origin serverHello
 - validate the origin details
 - HTTP fetch missing certificates (if any)
   - re-validate the origin details (repeat fetch as necessary)
 - formulate the Squid serverHello
 - send that to client
 - receive HTTP request over the secured client connection

... then all the HTTP(S) message processing on the resulting connection.

> 
> UPDATE: This problem statement seems local to a few websites. Outside of
> the proxy, those websites quite quickly as is expected.
> 
> 
> Any thoughts on where to look ?  other bits and pieces I could check ?
>  I have jumbo frames enabled (9000 bytes) but am running the proxies at
> L2 1500 MTU.
> 

It is hard to say without also knowing your squid.conf settings and what
sites specifically you are having trouble with,

Could be anything from a packet loss in some remote router halfway
around the world. To misconfiguration of something in your network. To
misconfiguration by the origin server admin.

IMO that last one is most likely if the behaviour is only delay (not
errors) and with only certain domains.

Amos