[squid-users] Rock datastore, CFLAGS and a crash that (may be) known

Thu Feb 18 12:02:34 UTC 2016

On 18/02/2016 5:21 p.m., Jester Purtteman wrote:
> 
> 
>> -----Original Message-----
>> From: Amos Jeffries [mailto:squid3 at treenet.co.nz]
>> Sent: Wednesday, February 17, 2016 8:13 AM
>> To: squid-users at lists.squid-cache.org
>> Cc: Jester Purtteman <jester at optimera.us>
>> Subject: Re: [squid-users] Rock datastore, CFLAGS and a crash that (may be)
>> known
>>
>> On 18/02/2016 2:36 a.m., Jester Purtteman wrote:
>>>
>>>>> cache_dir rock /var/spool/squid/rock/1 64000 swap-timeout=600
>>>>> max-swap-rate=600 min-size=0 max-size=128KB
>>>>>
>>>>> cache_dir rock /var/spool/squid/rock/2 102400 swap-timeout=600
>>>>> max-swap-rate=600 min-size=128KB max-size=256KB
>>>>>
>>>>> cache_dir aufs /var/spool/squid/aufs/1 200000 16 128 min-size=256KB
>>>>> max-size=4096KB
>>>>>
>>>>> cache_dir aufs /var/spool/squid/aufs/2 1500000 16 128
>>>>> min-size=4096KB max-size=8196000KB
>>>>
>>>> NP: don't forget to isolate the AUFS cache_dir within each worker.
>>>> Either with the squid.conf if-else-endif syntax, or ${process_id} macros.
>>>
>>
<snip>
>>
>> With the Disker model though you don't have to bother. All workers take
>> their HTTP message and pass it explicitly to the one Disker process that has
>> that object cached.
>> This is very similar in behaviour to how the CARP caching algorithm behaves
>> with a frontend/worker handling the HTTP messages and backend/Disker
>> doing the caching. But a huge amount more efficiently than a 2-layer CARP
>> setup.
>>
> 
> So, my only concern had been that I would end up with N workers and N
> separate datastores (with duplicated data).  Am I understanding that
> I can have several workers that may potentially share a single
> disker? (I am guessing there is a delay involved if two workers hit
> the same cache dir disker at the same time).  Or is there a need to
> do two datastores (that will have overlapping cache entries) and two
> diskers if I want two workers?  The latter (needing redundant
> datastores) is how I had understood that to work.  If I am wrong,
> then what am I waiting for?  Let me know, I think I may be seeing
> what I want to see in this case.
> 

"workers N" initiates N workers for handling HTTP messages.
Each "cache_dir rock" entry initiates just one Disker process which is
in charge of the objects coming and going to that cache_dir.

So several workers can share a single Disker; you do not need one
datastore per worker. The workers are all aware of the Diskers and ask
the relevant Disker to load data via shared memory as needed to answer
the HTTP message the worker is handling.
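
As a rough sketch (the worker count and cache size below are only
placeholders, not a recommendation), a config like this should give two
workers sharing one rock store through a single Disker:

  # two worker processes handling HTTP messages
  workers 2
  # one rock cache_dir, so one Disker process serving both workers
  cache_dir rock /var/spool/squid/rock/1 64000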

> 
>>
>>> I'm hoping to be able to cache big files a little more efficiently,
>>> currently they can lead to little seizures while squid opens a 2-gb
>>> file into memory, seizures that update software doesn't notice but end
>>> users do. I'm thinking isolation may be the answer there.
>>> Just thinking out loud, I'll grep the docs tonight and think about
>>> that more, pointers welcome.
>>
>> If I'm understanding you right the "seizure" you speak of is the (sometimes
>> large) jitter Squid introduces to packets and event processing delays when a
>> large object is being transferred.
>>
>> What we know about that is that the memory handling algorithm is very
>> inefficient in how it looks up the next bit of a stored object to send.
>> Even though Squid-3 is much better than Squid-2 it still has issues in the area.
>> That affects Squid just relaying any large object and is not particularly related
>> to the caching of them.
>>
>>
>> If you are allowing Squid to move very large objects into memory storage (via
>> maximum_object_size_in_memory) that can cause unnecessary hiccups.
>> Just reduce the max size directive and it should even out.
>>
> 
> Yeah, I actually think I might just set that up with no-cache, and
> set up a separate squid proxy for Very Big Files (well, sites with a
> high probability of generating Very Big Files, like windows update,
> apple update etc) and let THAT lock up.  Then I can set my cache size
> for the other 99.97% of the internet to something more reasonable,
> and that may fix the trouble I have had.
> 
> The big glitch is windows updates, those blasted things are FRIGGEN
> HUGE, and seem like quite the wasteful thing to pull over a satellite
> twice if it can be helped.  I have 800 users now, and I suspect a
> good fraction of all my traffic is just updates being sent and resent
> and resent.
> 

Nod. There was a thread not long ago where two of us tried to figure out
how best to handle those. The old strategy for dealing with Win7 and
older has kind of stopped working with the big Win10 updates and
forced-upgrades going on.
We got no good results yet AFAIK.

<snip>
> What actually seems to have induced a crash is having over 96 gb of
> rock store specified.  Whenever I bump that up to 128GB or beyond, it
> crashes with the error that started me out on this wild goose chase.
> The following line is actually what caused the crash, I just didn't
> catch that I changed both of them at once.  Thoughts?
> 
> cache_dir rock /var/spool/squid/rock/1 32000 swap-timeout=600 max-swap-rate=600 min-size=0 max-size=64KB
> cache_dir rock /var/spool/squid/rock/2 128000 swap-timeout=600 max-swap-rate=600 min-size=64KB max-size=256KB
> 

Those "KB" units are not understood by Squid. They are just ignored at
present since they are not digits.
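
For example, assuming the intent was 64KB = 65536 bytes and 256KB =
262144 bytes (min-size and max-size take plain byte counts), the two
lines would look something like this. It only fixes the units; it is
not expected to cure the crash by itself:

  cache_dir rock /var/spool/squid/rock/1 32000 swap-timeout=600 max-swap-rate=600 min-size=0 max-size=65536
  cache_dir rock /var/spool/squid/rock/2 128000 swap-timeout=600 max-swap-rate=600 min-size=65536 max-size=262144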

> Changing the size of cache_dir number 2 to 64000 stops it from
> crashing.  Adding a third 64000 rock store causes a crash as well.
> The disk itself is 180 gb formatted capacity.  I'll turn up the
> debugging and see if it says anything interesting.
> 

Just a guess, but I suspect this is related to some limit in your
shared-memory size (SMP rock cache needs lots of shared memory for the
cache index).

Other bugs are possible too, maybe related to the object size issue
above.

When you are creating a rock cache for large objects IIRC you should
increase the default slot size to reduce the number of slots per
object. You get a bit of wastage in the last slot, but relative to the
total object size that should not be much to pay. Rock defaults are
tuned to small under-32KB objects.
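
A sketch of what that might look like, using the rock slot-size option
(value in bytes). The 32768 here is only an illustrative guess, not a
tested recommendation, and the size limits assume the KB values are
rewritten as plain byte counts:

  # larger slots mean fewer slots per big object, at the cost of some
  # wasted space in each object's last slot
  cache_dir rock /var/spool/squid/rock/2 64000 slot-size=32768 min-size=65536 max-size=262144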

Amos