[squid-users] SQUID memory error after vm.swappiness changed from 60 to 10

Fri Nov 10 14:11:27 UTC 2017

On Thu, Nov 9, 2017 at 5:13 PM, Marcus Kool <marcus.kool at urlfilterdb.com> wrote:
>
>
> On 09/11/17 11:04, Bike dernikov1 wrote:
> [snip]
>>>>
>>>> Memory consumption: squid uses the largest part of memory (12GB now,
>>>> the second process uses 300MB), 14GB used by all processes. So squid
>>>> uses over 80% of total used memory.
>>>> So no, there are no problematic processes. But we changed the swappiness
>>>> setting.
>>>
>>>
>>> Did you monitor Squid for growth (it can start with 12 GB and grow
>>> slowly) ?
>>
>>
>> Yes, we are monitoring continuously.
>> Now:
>> Output from free -m.
>>
>>        total   used   free  shared  buff/cache  available
>> Mem:   24101  20507    256     146        3337       3034
>> Swap:  24561   5040  19521
>>
>> vm.swappiness=40
>>
>> Memory by process:
>> squid  VIRT    RES    SHR   MEM%
>>        22.9G   18.7G  8164  79.6
>
>
> Hmm. Squid grew from 12 GB to 18.7 GB (23 GB virtual).

Today the problem appeared again after logrotate at 2:56 AM.
Used memory peaked at 23.7GB.
Before logrotate started, cache was at 2GB and buffer at 1.5GB.
After logrotate started, cache jumped to 3.7GB and buffer stayed unchanged at 1.5GB.

Fork errors stopped after 1 minute, at 2:57.
Cache memory dropped by 500MB to 3.2GB and stayed at the same level
until morning; buffer stayed the same at 1.5GB.

After 4 minutes, at 3:00, a new WARNING appeared: external ACL queue
overload. Using stale results.

We have a night shift, and they told us that the Internet worked OK.

After a restart at around 7:00 AM, used memory dropped from 22GB to 7GB;
cache and buffer remained at the same levels.
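
For reference, the used shares in the free -m output quoted above can be
worked out with a minimal Python sketch like the one below. The function
names are illustrative only and are not part of Squid or any monitoring
tool; the sample text is the free -m output from this thread.

```python
def parse_free(text):
    """Parse `free -m` output into {"Mem": {...}, "Swap": {...}} (integer MiB)."""
    lines = [ln for ln in text.strip().splitlines() if ln.strip()]
    headers = lines[0].split()
    result = {}
    for line in lines[1:]:
        label, *values = line.split()
        # The Swap row has fewer columns than the header row;
        # zip() simply stops at the shorter of the two.
        result[label.rstrip(":")] = dict(zip(headers, map(int, values)))
    return result


def used_percent(row):
    """Share of 'total' that is 'used', in percent, rounded to one decimal."""
    return round(100.0 * row["used"] / row["total"], 1)


# Sample taken from the free -m output quoted in this thread.
SAMPLE = """\
       total   used   free  shared  buff/cache  available
Mem:   24101  20507    256     146        3337       3034
Swap:  24561   5040  19521
"""

if __name__ == "__main__":
    for name, row in parse_free(SAMPLE).items():
        print(f"{name}: {used_percent(row)}% used")
```

Running this periodically (for example from cron, feeding it the real
free -m output) would give a simple trend of RAM and swap usage around the
logrotate window.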

> With vm.swappiness=40 Linux starts to page out parts of processes when they
> occupy more than 60% of the memory.
> This is a potential bottleneck and I would have also decreased vm.swappiness
> to 10 as you did.
>
> My guess is that Squid starts too many helpers in a short time frame and
> that because of paging there are too many forks in progress simultaneously
> which causes the memory exhaustion.

We are now testing with 100 helpers for negotiate_kerberos_auth.
vm.swappiness has been set back to 60.

> I suggest to reduce the memory cache of Squid by 50% and set vm.swappiness
> to 20.

Squid cache memory is now set to 14GB, reduced in two steps from 20GB to
16GB and then to 14GB.

> And then observe:
> - total memory use
> - total swap usage (should be lower than the 5 GB that you have now)
> - number of helper processes that are started in short time frames
> And then in small steps increase the memory cache and maybe further reduce
> vm.swappiness to 10.

If we survive with the current setup, we will continue reducing as you suggest.
The last resort will be disabling swap (swapoff), but only as a test with 6
eyes on the monitoring :)
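
One of the things to observe is how many helpers are started in a short
time frame. A rough Python sketch for counting that from cache.log follows.
It assumes the "helperOpenServers: Starting N/M 'name' processes" message
sits on a single line, as in the excerpt from the first mail of this thread
(where the mail client wrapped it); the function name is made up for
illustration.

```python
import re
from collections import Counter

# Assumed line shape, based on the cache.log excerpt quoted in this thread:
#   2017/11/08 02:55:28 kid1| helperOpenServers: Starting 300/3000 'name' processes
START_RE = re.compile(
    r"^(?P<minute>\d{4}/\d{2}/\d{2} \d{2}:\d{2}):\d{2}\b.*"
    r"helperOpenServers: Starting (?P<count>\d+)/(?P<limit>\d+) '(?P<name>[^']+)'"
)


def helper_starts_per_minute(lines):
    """Sum requested helper starts per (minute, helper name)."""
    totals = Counter()
    for line in lines:
        match = START_RE.match(line)
        if match:
            key = (match.group("minute"), match.group("name"))
            totals[key] += int(match.group("count"))
    return totals


# Sample lines rebuilt from the excerpt in this thread (unwrapped).
SAMPLE_LOG = """\
2017/11/08 02:55:28 kid1| helperOpenServers: Starting 1/1000 'squidGuard' processes
2017/11/08 02:55:28 kid1| ipcCreate: fork: (12) Cannot allocate memory
2017/11/08 02:55:28 kid1| helperOpenServers: Starting 300/3000 'negotiate_kerberos_auth' processes
"""

if __name__ == "__main__":
    for (minute, name), count in sorted(helper_starts_per_minute(SAMPLE_LOG.splitlines()).items()):
        print(minute, name, count)
```

Note that this counts the starts Squid requested, not the forks that
actually succeeded; the "ipcCreate: fork" errors would have to be counted
separately.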

>> squidguard: two processes, 300MB each.
>>
>> CPU 0.33 0.37 0.43
>>
>>> Squid cannot fork and higher swappiness increases the amount of memory
>>> that
>>> the OS can use to copy processes.
>>> It makes me think that you have the memory overcommit set to 2 (no
>>> overcommit).
>>> What is the output of the following command ?
>>>     sysctl  -a | grep overcommit
>>
>>
>> Command output:
>>
>> vm.nr_overcommit_hugepages = 0
>> vm.overcommit_kbytes = 0
>> vm.overcommit_memory = 0
>> vm.overcommit_ratio = 50
>>
>> cat /proc/sys/vm/overcommit_memory
>> 0
>
>
> The overcommit settings look fine.

At least something right :)
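
For anyone reading along, the sysctl output above can be interpreted with a
small Python sketch like this one. The mode descriptions follow the Linux
kernel's overcommit accounting documentation; the function names are
illustrative, and the CommitLimit calculation is a simplification that
ignores huge pages and only matters when vm.overcommit_memory is 2.

```python
# Meaning of vm.overcommit_memory, per the Linux kernel documentation.
OVERCOMMIT_MODES = {
    0: "heuristic overcommit (kernel default)",
    1: "always overcommit",
    2: "strict accounting, no overcommit beyond CommitLimit",
}


def parse_sysctl(text):
    """Parse 'key = value' lines (as printed by sysctl) into a dict of ints."""
    settings = {}
    for line in text.splitlines():
        if "=" in line:
            key, value = line.split("=", 1)
            settings[key.strip()] = int(value.strip())
    return settings


def commit_limit_mb(ram_mb, swap_mb, settings):
    """Approximate CommitLimit in MB for mode 2 (huge pages ignored)."""
    kbytes = settings.get("vm.overcommit_kbytes", 0)
    if kbytes > 0:
        # A non-zero overcommit_kbytes takes the place of overcommit_ratio.
        return swap_mb + kbytes / 1024
    return swap_mb + ram_mb * settings["vm.overcommit_ratio"] / 100


# Sample taken from the sysctl output quoted in this thread.
SAMPLE = """\
vm.nr_overcommit_hugepages = 0
vm.overcommit_kbytes = 0
vm.overcommit_memory = 0
vm.overcommit_ratio = 50
"""

if __name__ == "__main__":
    s = parse_sysctl(SAMPLE)
    print("mode:", OVERCOMMIT_MODES[s["vm.overcommit_memory"]])
    # RAM and swap totals (MB) from the free -m output in this thread.
    print("CommitLimit if mode were 2:", commit_limit_mb(24101, 24561, s), "MB")
```

With mode 0 the limit is not enforced this way, so the second number is
only a what-if for a strict (mode 2) configuration.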

>>
>>>> Advice for some settings:
>>>> We have an absolute max peak of 2500 users who use squid (of 2800);
>>>> what are the recommended settings for:
>>>> negotiate_kerberos_children start/idle
>>>> squidguard helpers.
>>>
>>>
>>>
>>> I have little experience with kerberos, but most likely this is not the
>>> issue.
>>> When Squid cannot fork the helpers, helper settings do not matter much.
>>
>>
>>> For 2500 users you probably need 32-64 squidguard helpers.
>>
>>
>> Can you confirm: For 2500 users:
>>
>> url_rewrite children X (squidguard)  32-64 will be ok ? We have set
>> much larger number.

Squidguard url_rewrite children was set to 64.

> Did I understand it correctly that earlier in this reply you said that there
> are two squidguard processes (300 MB each)?

Yes (the first two processes in htop, two rewrite children); the others were at 0.0%.

> ufdbGuard is faster than squidGuard and has multithreaded helpers.
> ufdbGuard needs fewer helpers than squidGuard.
> If you have a much larger number than 64 url rewrite helpers then I suggest
> to switch to ufdbGuard as soon as possible since the memory usage is then at
> least 600% less.

ufdbGuard has a few strong features: development, kerberos,
concurrency/multithreading.
As I wrote, if we read documentation slower we wouldn't
Does ufdbGuard support LDAP secure auth? We tried LDAP secure with
squidguard without success.

>> For  helper:
>> negotiate_kerberos_auth
>>
>> auth_param negotiate children X startup Y idle Z. What X, Y, Z are
>> best for our user number ?
>>
>> We disabled kerberos replay cache because of disk performance (4 SAS
>> DISK  15K, RAID 10) (iowait jumped high, and CPU load jumped to min
>> 40 max 200).
>> We don't use disk caching.
>>
>> Thanks for help,
>>
>>> Marcus
>>>
>>>
>>>> Thanks for help,
>>>>
>>>> On Wed, Nov 8, 2017 at 10:53 AM, Marcus Kool
>>>> <marcus.kool at urlfilterdb.com> wrote:
>>>>>
>>>>>
>>>>> There is definitely a problem with available memory because Squid
>>>>> cannot
>>>>> fork.
>>>>> So start with looking at how much memory Squid and its helpers use.
>>>>> Do you have other processes on this system that consume a lot of memory
>>>>> ?
>>>>>
>>>>> Also note that ufdbGuard uses less memory than squidGuard.
>>>>> If there are 30 helpers squidguard uses 300% more memory than
>>>>> ufdbGuard.
>>>>>
>>>>> Look at the wiki for more information about memory usage:
>>>>> https://wiki.squid-cache.org/SquidFaq/SquidMemory   (currently has an
>>>>> expired certificate but it is safe to go ahead)
>>>>>
>>>>> Marcus
>>>>>
>>>>>
>>>>>
>>>>> On 08/11/17 07:26, Bike dernikov1 wrote:
>>>>>>
>>>>>>
>>>>>>
>>>>>> Hi, I hope that someone can explain what happened, why squid stopped
>>>>>> working.
>>>>>> The problem is related to  memory/swap handling.
>>>>>>
>>>>>> After we changed vm.swappiness parameter from 60 to 10 (tuning
>>>>>> attempt, to lower a disk usage, because we have only 4 disks in a
>>>>>> RAID10, so disk subsystem  is a weak link), we got a lot of errors in
>>>>>> cache.log.
>>>>>> The problems started after scheduled logrotate after  2AM.
>>>>>> Squid ran out of memory, auth helpers stopped working.
>>>>>> It's weird because we didn't disable swap, but behavior is like we
>>>>>> did.
>>>>>> After an error, we increased parameter from 10 to 40.
>>>>>>
>>>>>> The server has 24GB DDR3 memory,  disk swap set to 24GB, 12 CPU (24HT
>>>>>> cores).
>>>>>> We have 2800 users, using  kerberos authentication, squidguard for
>>>>>> filtering, ldap authorization.
>>>>>> When problem appeared memory was still 3GB free (free column), ram
>>>>>> (caching) was filled to 15GB, so 21 GB ram filled, 3GB free.
>>>>>>
>>>>>> Thanks for help,
>>>>>>
>>>>>>
>>>>>> errors from cache.log.
>>>>>>
>>>>>> 2017/11/08 02:55:27| Set Current Directory to /var/log/squid/
>>>>>> 2017/11/08 02:55:27 kid1| storeDirWriteCleanLogs: Starting...
>>>>>> 2017/11/08 02:55:27 kid1|   Finished.  Wrote 0 entries.
>>>>>> 2017/11/08 02:55:27 kid1|   Took 0.00 seconds (  0.00 entries/sec).
>>>>>> 2017/11/08 02:55:27 kid1| logfileRotate:
>>>>>> daemon:/var/log/squid/access.log
>>>>>> 2017/11/08 02:55:27 kid1| logfileRotate:
>>>>>> daemon:/var/log/squid/access.log
>>>>>> 2017/11/08 02:55:28 kid1| Pinger socket opened on FD 30
>>>>>> 2017/11/08 02:55:28 kid1| helperOpenServers: Starting 1/1000
>>>>>> 'squidGuard' processes
>>>>>> 2017/11/08 02:55:28 kid1| ipcCreate: fork: (12) Cannot allocate memory
>>>>>> 2017/11/08 02:55:28 kid1| WARNING: Cannot run '/usr/bin/squidGuard'
>>>>>> process.
>>>>>> 2017/11/08 02:55:28 kid1| helperOpenServers: Starting 300/3000
>>>>>> 'negotiate_kerberos_auth' processes
>>>>>> 2017/11/08 02:55:28 kid1| ipcCreate: fork: (12) Cannot allocate memory
>>>>>> 2017/11/08 02:55:28 kid1| WARNING: Cannot run
>>>>>> '/usr/lib/squid/negotiate_kerberos_auth' process.
>>>>>> 2017/11/08 02:55:28 kid1| ipcCreate: fork: (12) Cannot allocate memory
>>>>>> 2017/11/08 02:55:28 kid1| WARNING: Cannot run
>>>>>> '/usr/lib/squid/negotiate_kerberos_auth' process.
>>>>>> 2017/11/08 02:55:28 kid1| ipcCreate: fork: (12) Cannot allocate memory
>>>>>> 2017/11/08 02:55:28 kid1| WARNING: Cannot run
>>>>>> '/usr/lib/squid/negotiate_kerberos_auth' process.
>>>>>>
>>>>>> external ACL 'memberof' queue overload. Using stale result.
>>>>>> _______________________________________________
>>>>>> squid-users mailing list
>>>>>> squid-users at lists.squid-cache.org
>>>>>> http://lists.squid-cache.org/listinfo/squid-users
>>>>>>