[squid-users] Too many ERROR: Collapsed forwarding queue overflow for kid2 at 1024 items

Loučanský Lukáš Loucansky.Lukas at kjj.cz
Tue Nov 16 09:38:12 UTC 2021


Ok - but is it going to be patched only in the v6 version? I've just grabbed  squid-5.2-20211028-r74f073ecd and (for example) ReadWriteLock.cc does not look like in the patch (no finalizeExclusive() etc.)

Anyway - in the morning I run debug with 20,9 to see:

2021/11/16 09:02:06.459 kid2| ctx: enter level  0: 'http://i5.c.eset.com/v1/auth/4574614B3E4735238620/updlist/9/eid/1041372/lid/1041535'
2021/11/16 09:02:06.459 kid2| 20,7| store.cc(1431) timestampsSet: e:tw3490=p2VC/0x5620c7a7b000*3 had LV:-1        LU:1637049726 LM:-1        EX:-1
2021/11/16 09:02:06.459 kid2| 20,5| store.cc(1489) timestampsSet: e:tw3490=p2VC/0x5620c7a7b000*3 has LV:1637049726 LU:1637049726 LM:1637049726 EX:1637049736
2021/11/16 09:02:06.459 kid2| 20,3| Controller.cc(441) peek: 4F9D54687B6F14B553B031EA2D48E686
2021/11/16 09:02:06.459 kid2| 20,3| Controller.cc(449) peek: got local in-transit entry: e:tw3490=p2VC/0x5620c7a7b000*3
2021/11/16 09:02:06.459 kid2| 20,3| store.cc(443) lock: HttpStateData::haveParsedReplyHeaders locked key 4F9D54687B6F14B553B031EA2D48E686 e:tw3490=p2VC/0x5620c7a7b000*4
2021/11/16 09:02:06.459 kid2| 20,3| store.cc(467) unlock: HttpStateData::haveParsedReplyHeaders unlocking key 4F9D54687B6F14B553B031EA2D48E686 e:tw3490=p2VC/0x5620c7a7b000*4
2021/11/16 09:02:06.460 kid2| 20,3| store.cc(596) setPublicKey: e:tw3490=p2VC/0x5620c7a7b000*3
2021/11/16 09:02:06.460 kid2| ctx: exit level  0
2021/11/16 09:02:06.460 kid2| 20,5| store.cc(815) write: storeWrite: writing 300 bytes for '4F9D54687B6F14B553B031EA2D48E686'
2021/11/16 09:02:06.460 kid2| 20,7| Controller.cc(537) memoryCacheHasSpaceFor: 1: 87+1?1038435
2021/11/16 09:02:06.460 kid2| 20,2| store.cc(991) checkCachable: StoreEntry::checkCachable: NO: too small
2021/11/16 09:02:06.460 kid2| 20,3| store.cc(455) releaseRequest: 0 e:tw3490=p2VC/0x5620c7a7b000*3
2021/11/16 09:02:06.460 kid2| 20,3| store.cc(569) setPrivateKey: 01 e:tw3490=p2VC/0x5620c7a7b000*3
2021/11/16 09:02:06.460 kid2| 20,7| Controller.cc(504) evictCached: e:tw3490=p2XVC/0x5620c7a7b000*3
2021/11/16 09:02:06.460 kid2| 20,5| Transients.cc(347) evictCached: e:tw3490=p2XVC/0x5620c7a7b000*3
2021/11/16 09:02:06.460 kid2| 20,3| store.cc(421) hashInsert: StoreEntry::hashInsert: Inserting Entry e:tw3490=p2XIVC/0x5620c7a7b000*3 key '4F710300000000008338000002000000'
2021/11/16 09:02:06.460 kid2| 20,3| store_swapout.cc(379) mayStartSwapOut: not cachable
2021/11/16 09:02:06.460 kid2| 20,7| store.cc(1833) transientsAbandonmentCheck: cannot be shared: e:tw3490=p2XIVC/0x5620c7a7b000*3
2021/11/16 09:02:06.460 kid2| 20,5| Transients.cc(347) evictCached: e:tw3490=p2XIVC/0x5620c7a7b000*3
2021/11/16 09:02:06.460 kid2| 20,7| MemStore.cc(828) write: entry e:tw3490=p2XIVC/0x5620c7a7b000*3
2021/11/16 09:02:06.460 kid2| 20,2| store.cc(967) checkCachable: StoreEntry::checkCachable: NO: not cachable
2021/11/16 09:02:06.460 kid2| 20,7| MemStore.cc(585) shouldCache: Not memory cachable: e:tw3490=p2XIVC/0x5620c7a7b000*3
2021/11/16 09:02:06.460 kid2| 20,7| store.cc(1833) transientsAbandonmentCheck: cannot be shared: e:tw3490=p2XIVC/0x5620c7a7b000*3
2021/11/16 09:02:06.460 kid2| 20,5| Transients.cc(347) evictCached: e:tw3490=p2XIVC/0x5620c7a7b000*3
2021/11/16 09:02:06.460 kid2| 20,7| Controller.cc(609) memoryOut: keepInLocalMemory: 0
2021/11/16 09:02:06.460 kid2| 20,3| store.cc(443) lock: ClientHttpRequest::loggingEntry locked key 4F710300000000008338000002000000 e:tw3490=p2XIVC/0x5620c7a7b000*4
2021/11/16 09:02:06.460 kid2| 20,5| store.cc(815) write: storeWrite: writing 62 bytes for '4F710300000000008338000002000000'
2021/11/16 09:02:06.460 kid2| 20,7| Controller.cc(537) memoryCacheHasSpaceFor: 1: 88+1?1038435
2021/11/16 09:02:06.460 kid2| 20,3| store_swapout.cc(347) mayStartSwapOut:  already rejected
2021/11/16 09:02:06.460 kid2| 20,7| MemStore.cc(828) write: entry e:tw3490=p2XIV/0x5620c7a7b000*4
2021/11/16 09:02:06.460 kid2| 20,7| Controller.cc(609) memoryOut: keepInLocalMemory: 0
2021/11/16 09:02:06.461 kid2| 20,3| store.cc(1061) complete: storeComplete: '4F710300000000008338000002000000'
2021/11/16 09:02:06.461 kid2| 20,3| store.cc(1244) validLength: storeEntryValidLength: Checking '4F710300000000008338000002000000'
2021/11/16 09:02:06.461 kid2| 20,5| store.cc(1246) validLength: storeEntryValidLength:     object_len = 362
2021/11/16 09:02:06.461 kid2| 20,5| store.cc(1247) validLength: storeEntryValidLength:         hdr_sz = 300
2021/11/16 09:02:06.461 kid2| 20,5| store.cc(1248) validLength: storeEntryValidLength: content_length = 62
2021/11/16 09:02:06.461 kid2| 20,3| store_swapout.cc(347) mayStartSwapOut:  already rejected
2021/11/16 09:02:06.461 kid2| 20,7| MemStore.cc(828) write: entry e:tw3490=sp2XIV/0x5620c7a7b000*4
2021/11/16 09:02:06.461 kid2| 20,7| Controller.cc(609) memoryOut: keepInLocalMemory: 0
2021/11/16 09:02:06.461 kid2| 20,3| store.cc(467) unlock: Client unlocking key 4F710300000000008338000002000000 e:tw3490=sp2XIV/0x5620c7a7b000*4
2021/11/16 09:02:06.461 kid2| 20,3| store.cc(467) unlock: FwdState unlocking key 4F710300000000008338000002000000 e:tw3490=sp2XIV/0x5620c7a7b000*3
2021/11/16 09:02:06.461 kid2| 20,3| store.cc(443) lock: store_client::copy locked key 4F710300000000008338000002000000 e:tw3490=sp2XIV/0x5620c7a7b000*3
2021/11/16 09:02:06.461 kid2| 20,3| store.cc(467) unlock: store_client::copy unlocking key 4F710300000000008338000002000000 e:tw3490=sp2XIV/0x5620c7a7b000*3
2021/11/16 09:02:06.461 kid2| 20,3| store_swapout.cc(347) mayStartSwapOut:  already rejected
2021/11/16 09:02:06.461 kid2| 20,7| MemStore.cc(828) write: entry e:tw3490=sp2XIV/0x5620c7a7b000*2
2021/11/16 09:02:06.461 kid2| 20,7| Controller.cc(609) memoryOut: keepInLocalMemory: 0
2021/11/16 09:02:06.461 kid2| 20,3| store.cc(443) lock: storeUnregister locked key 4F710300000000008338000002000000 e:tw3490=sp2XIV/0x5620c7a7b000*3
2021/11/16 09:02:06.461 kid2| 20,3| store.cc(467) unlock: storeUnregister unlocking key 4F710300000000008338000002000000 e:tw3490=sp2XIV/0x5620c7a7b000*3
2021/11/16 09:02:06.461 kid2| 20,3| store.cc(467) unlock: clientReplyContext::removeStoreReference unlocking key 4F710300000000008338000002000000 e:tw3490=sp2XIV/0x5620c7a7b000*2
2021/11/16 09:02:06.461 kid2| 20,3| store.cc(467) unlock: ClientHttpRequest::loggingEntry unlocking key 4F710300000000008338000002000000 e:tw3490=sp2XIV/0x5620c7a7b000*1
2021/11/16 09:02:06.461 kid2| 20,5| store.cc(483) doAbandon: e:tw3490=sp2XIV/0x5620c7a7b000*0 via ClientHttpRequest::loggingEntry
2021/11/16 09:02:06.461 kid2| 20,3| store.cc(1180) release: 0 e:tw3490=sp2XIV/0x5620c7a7b000*0 4F710300000000008338000002000000
2021/11/16 09:02:06.461 kid2| 20,7| Controller.cc(504) evictCached: e:tw3490=sp2XIV/0x5620c7a7b000*0
2021/11/16 09:02:06.461 kid2| 20,5| Transients.cc(347) evictCached: e:tw3490=sp2XIV/0x5620c7a7b000*0
2021/11/16 09:02:06.461 kid2| 20,3| store.cc(399) destroyStoreEntry: destroyStoreEntry: destroying 0x5620c7a7b008
2021/11/16 09:02:06.461 kid2| 20,3| store.cc(381) destroyMemObject: 0x5620ae385100 in e:tw3490=sp2XIV/0x5620c7a7b000*0
2021/11/16 09:02:06.461 kid2| 20,5| Transients.cc(372) disconnect: e:tw3490=sp2XIV/0x5620c7a7b000*0
2021/11/16 09:02:06.461 kid2| 20,3| MemObject.cc(108) ~MemObject: MemObject destructed, this=0x5620ae385100
2021/11/16 09:02:06.461 kid2| 20,5| store.cc(354) ~StoreEntry: StoreEntry destructed, this=0x5620c7a7b000
2021/11/16 09:02:06.488 kid2| 20,3| store.cc(1061) complete: storeComplete: '4E710300000000008338000002000000'
2021/11/16 09:02:06.488 kid2| 20,3| store.cc(1244) validLength: storeEntryValidLength: Checking '4E710300000000008338000002000000'
2021/11/16 09:02:06.488 kid2| 20,5| store.cc(1246) validLength: storeEntryValidLength:     object_len = 0
2021/11/16 09:02:06.488 kid2| 20,5| store.cc(1247) validLength: storeEntryValidLength:         hdr_sz = 0
2021/11/16 09:02:06.488 kid2| 20,5| store.cc(1248) validLength: storeEntryValidLength: content_length = -1
2021/11/16 09:02:06.488 kid2| 20,5| store.cc(1251) validLength: storeEntryValidLength: Unspecified content length: 4E710300000000008338000002000000
2021/11/16 09:02:06.488 kid2| 20,2| store.cc(991) checkCachable: StoreEntry::checkCachable: NO: too small
2021/11/16 09:02:06.488 kid2| 20,3| store.cc(455) releaseRequest: 0 e:=sp2IV/0x5620ef474d90*2
2021/11/16 09:02:06.488 kid2| 20,3| store.cc(569) setPrivateKey: 01 e:=sp2IV/0x5620ef474d90*2
2021/11/16 09:02:06.488 kid2| 20,3| store_swapout.cc(379) mayStartSwapOut: not cachable
2021/11/16 09:02:06.488 kid2| 20,7| MemStore.cc(828) write: entry e:=sp2XIV/0x5620ef474d90*2
2021/11/16 09:02:06.488 kid2| 20,2| store.cc(967) checkCachable: StoreEntry::checkCachable: NO: not cachable
2021/11/16 09:02:06.488 kid2| 20,7| MemStore.cc(585) shouldCache: Not memory cachable: e:=sp2XIV/0x5620ef474d90*2
2021/11/16 09:02:06.488 kid2| 20,7| Controller.cc(609) memoryOut: keepInLocalMemory: 0
2021/11/16 09:02:06.488 kid2| 20,3| store.cc(467) unlock: FwdState unlocking key 4E710300000000008338000002000000 e:=sp2XIV/0x5620ef474d90*2
2021/11/16 09:02:06.496 kid2| 20,3| Controller.cc(441) peek: 1E955E11E1FC7030D6ABDD39209EB79E
2021/11/16 09:02:06.496 kid2| 20,5| store.cc(349) StoreEntry: StoreEntry constructed, this=0x5620d722e560
2021/11/16 09:02:06.496 kid2| 20,3| MemObject.cc(100) MemObject: MemObject constructed, this=0x5620ef479200
2021/11/16 09:02:06.496 kid2| 20,7| Disks.cc(232) get: cache_dir 0 has: e:d1063200 at 0=sw2V/0x5620d722e560*0
2021/11/16 09:02:06.496 kid2| 20,3| Controller.cc(470) peek: got disk-cached entry: e:d1063200 at 0=sw2V/0x5620d722e560*0
2021/11/16 09:02:06.496 kid2| assertion failed: Transients.cc:221: "old == e"

In fact - I don't even know if it's related to the eset files. I have refresh_pattern only for u.*\.eset\.com.*\/ and nothing for eset in the storeid_rewrite patterns.

LL


-----Original Message-----
From: Alex Rousskov [mailto:rousskov at measurement-factory.com] 
Sent: Monday, November 15, 2021 5:17 PM
To: Squid Users
Cc: Loučanský Lukáš
Subject: Re: [squid-users] Too many ERROR: Collapsed forwarding queue overflow for kid2 at 1024 items

On 11/15/21 7:43 AM, Loučanský Lukáš wrote:

> 2021/11/14 10:13:30 kid2| assertion failed: Transients.cc:221: "old == e"
> 2021/11/15 08:37:36 kid2| assertion failed: Transients.cc:221: "old == e"
> 2021/11/15 11:54:14 kid1| assertion failed: Transients.cc:221: "old == e"
> 2021/11/15 12:16:27 kid1| assertion failed: Transients.cc:221: "old == e"

I recommend ignoring queue overflows until the above assertions are fixed because worker deaths cause queue overflows. Your Squid is buggy, and those bugs essentially cause queue overflows.

The assertion itself is known as Bug 5134:
https://bugs.squid-cache.org/show_bug.cgi?id=5134

That bug has a speculative fix (master/v6 commit 5210df4). Please try it if you can.


HTH,

Alex.


> -----Original Message-----
> From: Alex Rousskov [mailto:rousskov at measurement-factory.com]
> Sent: Friday, November 12, 2021 5:24 PM
> To: Loučanský Lukáš; squid-users at lists.squid-cache.org
> Subject: Re: [squid-users] Too many ERROR: Collapsed forwarding queue 
> overflow for kid2 at 1024 items
> 
> On 11/11/21 10:19 AM, Loučanský Lukáš wrote:
> 
>> recently I'm facing too many ERROR: Collapsed forwarding queue 
>> overflow for kid2 at 1024 items lines in my Squid 5.2 log files.
> 
> We see those overflows when kids die. Do you see any FATAL messages, assertions, or similar deadly errors in cache.log?
> 
> 
>> Could someone elaborate how the queue is filled - what is clogging it?
> 
> The sender/writer sends messages faster than the recipient/reader is 
> reading them, eventually exceeding the queue capacity (i.e. 1024 
> messages). These messages are about Store entries that may need 
> synchronization across workers. Each message is very sm
all.
> 
> 
>> I don't mind too much if I have to turn collapsed forwarding off
> 
> Most likely, the problem is not tied to collapsed forwarding. These 
> queues were used for collapsed forwarding when they were added, but 
> they are used for regular traffic as well in modern SMP Squids. We 
> need to change the queue names (and related code/m
essage text) to reflect the expanded nature of these queues.
> 
> 
> HTH,
> 
> Alex.
> 



More information about the squid-users mailing list