[squid-dev] Rock store stopped accessing discs

Alex Rousskov rousskov at measurement-factory.com
Tue Mar 7 18:22:08 UTC 2017


On 03/07/2017 10:58 AM, Heiler Bemerguy wrote:

> I used iostat to check if "right now" the hds were being accessed. A
> lot of minutes passed and all writes/reads remained Zero. 

Understood.


> With a 80mbit/s traffic going on, how could nothing be written nor read from
> disc? 

I can come up with several possible explanations, but more information
is needed to minimize guessing. See below for some suggestions.


> How can I check IPC RAM? I've never tweaked it.

Search Squid wiki for SMP troubleshooting hints and/or ask your local
sysadmin. If you cannot find anything relevant, ask here again and,
hopefully, somebody will be able to guide you.


> I've just noticed that squid was running since feb/18 (Start Time:    
> Sat, 18 Feb 2017 15:38:44 GMT) and since the beginning there were a lot
> of warnings on cache.log.. (The logs I pasted on the earlier email was
> from today's usage only..)
> I think since then, it stopped using the cache stores..

I assume "then" means "today's usage", not "feb/18".

> 2017/02/18 13:48:19 kid3| ERROR: worker I/O push queue for /cache4/rock overflow: ipcIo3.9082w9
> 2017/02/18 13:48:42 kid4| ERROR: worker I/O push queue for /cache4/rock overflow: ipcIo4.3371w9
> 2017/02/18 14:06:01 kid9| WARNING: /cache4/rock delays I/O requests for 9.97 seconds to obey 200/sec rate limit
> 2017/02/18 14:06:34 kid9| WARNING: /cache4/rock delays I/O requests for 21.82 seconds to obey 200/sec rate limit
> 2017/02/18 14:06:42 kid4| WARNING: abandoning 1 /cache4/rock I/Os after at least 7.00s timeout
> 2017/02/18 14:06:47 kid3| WARNING: abandoning 1 /cache4/rock I/Os after at least 7.00s timeout
> 2017/02/18 14:06:48 kid1| WARNING: abandoning 1 /cache4/rock I/Os after at least 7.00s timeout
> 2017/02/18 14:06:49 kid4| WARNING: abandoning 4 /cache4/rock I/Os after at least 7.00s timeout
> 2017/02/18 14:06:54 kid3| WARNING: abandoning 2 /cache4/rock I/Os after at least 7.00s timeout
> 2017/02/18 14:07:55 kid9| WARNING: /cache4/rock delays I/O requests for 68.64 seconds to obey 200/sec rate limit
> 2017/02/18 14:08:03 kid5| WARNING: abandoning 511 /cache4/rock I/Os after at least 7.00s timeout
> 2017/02/18 14:08:47 kid2| WARNING: abandoning 20 /cache4/rock I/Os after at least 7.00s timeout
> 2017/02/18 14:08:51 kid3| WARNING: abandoning 41 /cache4/rock I/Os after at least 7.00s timeout
> 2017/02/18 14:08:54 kid1| WARNING: abandoning 41 /cache4/rock I/Os after at least 7.00s timeout
> 2017/02/18 15:26:35 kid5| ERROR: worker I/O push queue for /cache4/rock overflow: ipcIo5.31404w9
> 2017/02/18 15:29:00 kid9| WARNING: /cache4/rock delays I/O requests for 9.92 seconds to obey 200/sec rate limit
> 2017/02/18 15:29:13 kid9| WARNING: /cache4/rock delays I/O requests for 8.23 seconds to obey 200/sec rate limit
> 2017/02/18 15:29:45 kid9| WARNING: /cache4/rock delays I/O requests for 8.86 seconds to obey 200/sec rate limit
> 2017/02/18 15:30:06 kid9| WARNING: /cache4/rock delays I/O requests for 7.34 seconds to obey 200/sec rate limit
> 2017/02/18 15:30:27 kid9| WARNING: /cache4/rock delays I/O requests for 7.65 seconds to obey 200/sec rate limit
> 2017/02/18 15:30:48 kid9| WARNING: /cache4/rock delays I/O requests for 8.97 seconds to obey 200/sec rate limit
> 2017/02/18 15:31:09 kid9| WARNING: /cache4/rock delays I/O requests for 8.52 seconds to obey 200/sec rate limit
> 2017/02/18 15:31:22 kid9| WARNING: /cache4/rock delays I/O requests for 10.61 seconds to obey 200/sec rate limit
> 2017/02/18 17:19:40 kid9| WARNING: /cache4/rock delays I/O requests for 10.22 seconds to obey 200/sec rate limit

There is a Squid bug and/or your cache disks could not keep up with the
I/O load. Please note that I/O load during initial cache index rebuild
is higher.
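
To see which kids and which kinds of messages dominate, the warnings
quoted above can be tallied per kid. The following is only a rough
sketch: the regular expressions are derived from the cache.log lines
quoted in this email, and the function name summarize() is made up for
illustration. It expects raw cache.log lines (without the "> " quoting
prefix).

```python
import re
from collections import Counter

# Timestamp, kid name, and message, as seen in the quoted cache.log lines.
LINE = re.compile(
    r"^(?P<ts>\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}) (?P<kid>kid\d+)\| (?P<msg>.*)$"
)
# The three rock-related message kinds quoted above.
OVERFLOW = re.compile(r"worker I/O push queue for (\S+) overflow")
ABANDON = re.compile(r"abandoning (\d+) (\S+) I/Os")
DELAY = re.compile(r"(\S+) delays I/O requests for ([\d.]+) seconds")


def summarize(lines):
    """Return (overflow counts, abandoned I/O totals, max delay) keyed by kid."""
    overflows = Counter()
    abandoned = Counter()
    max_delay = {}
    for raw in lines:
        m = LINE.match(raw.strip())
        if m is None:
            continue  # not a line in the expected "date time kidN| ..." form
        kid, msg = m.group("kid"), m.group("msg")
        if OVERFLOW.search(msg):
            overflows[kid] += 1
            continue
        a = ABANDON.search(msg)
        if a:
            abandoned[kid] += int(a.group(1))
            continue
        d = DELAY.search(msg)
        if d:
            max_delay[kid] = max(max_delay.get(kid, 0.0), float(d.group(2)))
    return overflows, abandoned, max_delay
```

Usage would be something like summarize(open("cache.log")), with the
path adjusted to wherever your cache.log lives.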

There are several ways to proceed IMO:

1. Figure out why your Squid is not using the disk cache _now_. This
will require enabling debugging, at least for a few seconds, and then
analyzing cache.log. I recommend enabling debugging in one worker kid
only (i.e., sending its process the right signal instead of running
"squid -k debug"). See Squid wiki for debugging details. Do not restart
or reconfigure Squid until you collect those debugging logs! Please note
that the logs may contain private user info.

2. Reduce max-swap-rate from 200 to, say, 20. If your disks cannot
keep up _and_ there is a Squid bug that screws something up when your
disks cannot keep up, then this blind configuration change may avoid
triggering that bug.

3. Collect enough iostat 5-second outputs (or equivalent) to correlate
system performance with cache.log messages. I would also collect other
system activity during those hours. The "atop" tool may be useful for
collecting everything in one place. You will probably want to restart
Squid for a clean experiment/collection.
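
For the correlation in option 3, one possible approach is to count
cache.log warnings and errors per 5-second window, so that the counts
can be placed next to iostat samples taken at the same interval. This is
a sketch under assumptions: the timestamp format is taken from the
cache.log lines quoted above, bucket_counts() is a made-up name, and
your iostat (or atop) samples need their own timestamps for the
comparison to work.

```python
from collections import Counter
from datetime import datetime, timedelta

EPOCH = datetime(1970, 1, 1)


def bucket_counts(lines, window_seconds=5):
    """Count WARNING/ERROR cache.log lines per fixed-size time window.

    Keys are the (naive) start times of each window.
    """
    counts = Counter()
    for raw in lines:
        if "WARNING" not in raw and "ERROR" not in raw:
            continue
        try:
            # cache.log lines start with "YYYY/MM/DD HH:MM:SS".
            ts = datetime.strptime(raw[:19], "%Y/%m/%d %H:%M:%S")
        except ValueError:
            continue  # line does not start with a timestamp
        elapsed = int((ts - EPOCH).total_seconds())
        start = EPOCH + timedelta(seconds=elapsed - elapsed % window_seconds)
        counts[start] += 1
    return counts
```

Printing sorted(bucket_counts(open("cache.log")).items()) gives a
timeline that can be compared with the disk activity reported for the
same windows.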


HTH,

Alex.


