[squid-users] Squid crash - 3.5.21

Alex Rousskov rousskov at measurement-factory.com
Tue Nov 12 14:15:20 UTC 2019


FTR: hindsight1 runs v4.8 despite the email subject saying "3.5.21".


On 11/11/19 9:34 PM, hindsight1 wrote:

> thank you for your reply

Thank you for detailing the problem. The best place to discuss these
low-level details is Squid Bugzilla. I suggest that you open a new bug
report there. If you want to continue here, then the primary remaining
question for me is _why_ theLevels array elements are misaligned in your
tests:

* theLevels[0]: The segments are allocated on page-boundaries so the
first level element (i.e. level[0]) should be properly aligned. I
believe your stack trace shows that this zero-offset access is misaligned.

* theLevels[n+1]: The levels array is declared using regular C++
constructs, without any casting, so subsequent elements should be
aligned properly if the first element is aligned properly.

So where does this misalignment come from? Properly addressing this
bug probably requires answering this question.
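
As a quick sanity check on the address in the trace, here is a minimal
C++ sketch. IsAligned() is a hypothetical helper, not Squid code, and the
8-byte requirement is an assumption (what a 64-bit unsigned long needs on
arm64 Linux). The address is the "this" value from frames #4 and #5.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper, not part of Squid: true when addr is a multiple
// of the given (non-zero) alignment.
inline bool IsAligned(const std::uint64_t addr, const std::size_t alignment)
{
    assert(alignment > 0);
    return addr % alignment == 0;
}

// The "this" value gdb reported in frames #4 and #5 of the stack trace.
const std::uint64_t ReportedAddress = 0xfff9d9d40cecULL;

// Assumption: the atomic counter needs 8-byte alignment, as a 64-bit
// unsigned long does on arm64 Linux.
const std::size_t AssumedAlignment = 8;
```

IsAligned(ReportedAddress, AssumedAlignment) is false, while
IsAligned(ReportedAddress, 4) is true, which matches the "multiple of 4"
observation in the quoted report.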


Please note that there are GCC v4 bugs that might be relevant here:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=62259
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65147

What is your compiler? Does building with GCC v5.1 fix the problem?


BTW, there is a potentially useful alignas() workaround/trick shown at
https://stackoverflow.com/questions/26703297/alignment-of-atomic-variables
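
A minimal sketch of what such an alignas() declaration could look like.
AlignedLevel and ReadLevel() are hypothetical names for illustration;
Squid's actual declarations may differ. Keep in mind that alignas() only
affects objects the compiler places itself. It does not fix a pointer
that was computed by hand into raw memory.

```cpp
#include <atomic>
#include <cstddef>
#include <cstdint>

// Hypothetical type for illustration: requests 8-byte alignment for
// the wrapped 64-bit atomic counter.
struct alignas(8) AlignedLevel {
    std::atomic<std::uint64_t> value;
};

// Compile-time checks: every array element starts on an 8-byte boundary
// only if both the alignment and the element size are multiples of 8.
static_assert(alignof(AlignedLevel) >= 8, "counter must be 8-byte aligned");
static_assert(sizeof(AlignedLevel) % 8 == 0, "array stride must keep alignment");

// Hypothetical accessor: an atomic load of one counter.
inline std::uint64_t ReadLevel(const AlignedLevel &level)
{
    return level.value.load();
}
```

If either static_assert fails on a given compiler, the build stops
instead of producing a binary that may die with a bus error at run time.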



Thank you,

Alex.


> Sorry for my English. First of all,
>> I am not sure whether "it" in your sentence refers to gdb or Squid, but
>> if Squid dereferences an unaligned data field in shared memory, then it
>> is most likely a Squid bug.
> 
> "it" refers to Squid.
> 
> I want to explain my problem again.
> 
> 
> I run Squid 4.8 in SMP mode on the Arm64 platform. With some worker
> counts (for example 4, 7, or 9), the same error, "Received Bus
> Error...dying", appears in the log.
> Using gdb, I found that the error occurs when accessing the theLevels
> variable in the Ipc::Mem::PagePool::level function: the atomic load
> operation accesses a non-aligned address.
> Here is stacktrace:
> 
> #0  0x0000ffff9c945228 in raise () from /lib64/libc.so.6
> #1  0x0000ffff9c9468a0 in abort () from /lib64/libc.so.6
> #2  0x00000000007c6238 in death (sig=7) at tools.cc:359
> #3  <signal handler called>
> #4  0x00000000007c50a4 in std::__atomic_base<unsigned long>::load
> (this=0xfff9d9d40cec, __m=std::memory_order_seq_cst) at
> /usr/include/c++/4.8.2/bits/atomic_base.h:496
> #5  0x00000000007c4344 in std::__atomic_base<unsigned long>::operator
> unsigned long (this=0xfff9d9d40cec) at
> /usr/include/c++/4.8.2/bits/atomic_base.h:367
> #6  0x0000000000937f2c in Ipc::Mem::PagePool::level (this=0xde9150,
> purpose=0) at mem/PagePool.cc:46
> #7  0x0000000000934ae8 in Ipc::Mem::PageLevel (purpose=0) at mem/Pages.cc:88
> #8  0x00000000007c3db0 in Ipc::Mem::PagesAvailable (purpose=0) at
> ipc/mem/Pages.h:51
> #9  0x00000000009342a4 in Ipc::Mem::GetPage
> (purpose=Ipc::Mem::PageId::cachePage, page=...) at mem/Pages.cc:36
> #10 0x00000000007c241c in MemStore::reserveSapForWriting (this=0x10e3bb0,
> page=...) at MemStore.cc:778
> #11 0x00000000007c1c18 in MemStore::nextAppendableSlice (this=0x10e3bb0,
> fileNo=513735, sliceOffset=@0x10e3bd8: -1) at MemStore.cc:731
> #12 0x00000000007c145c in MemStore::copyToShm (this=0x10e3bb0, e=...) at
> MemStore.cc:682
> #13 0x00000000007c2cdc in MemStore::write (this=0x10e3bb0, e=...) at
> MemStore.cc:856
> #14 0x00000000009f9838 in Store::Controller::memoryOut (this=0xdd2e20,
> e=..., preserveSwappable=true) at Controller.cc:550
> #15 0x00000000007b50e8 in StoreEntry::swapOut (this=0x125e4e0) at
> store_swapout.cc:175
> #16 0x00000000007af4a4 in StoreEntry::invokeHandlers (this=0x125e4e0) at
> store_client.cc:720
> #17 0x00000000007a650c in StoreEntry::flush (this=0x125e4e0) at
> store.cc:1674
> #18 0x00000000007a6dcc in StoreEntry::startWriting (this=0x125e4e0) at
> store.cc:1844
> #19 0x000000000083b7e8 in Client::setFinalReply (this=0x1275b18,
> rep=0x12b36c0) at Client.cc:164
> #20 0x00000000008401bc in Client::adaptOrFinalizeReply (this=0x1275b18) at
> Client.cc:974
> #21 0x0000000000726344 in HttpStateData::processReply (this=0x1275b18) at
> http.cc:1246
> #22 0x0000000000725fec in HttpStateData::readReply (this=0x1275b18, io=...)
> at http.cc:1223
> #23 0x000000000072fc10 in CommCbMemFunT<HttpStateData,
> CommIoCbParams>::doDial (this=0x1285f80) at CommCalls.h:205
> #24 0x0000000000730070 in JobDialer<HttpStateData>::dial (this=0x1285f80,
> call=...) at base/AsyncJobCalls.h:174
> #25 0x000000000072f728 in AsyncCallT<CommCbMemFunT<HttpStateData,
> CommIoCbParams> >::fire (this=0x1285f50) at ../src/base/AsyncCall.h:145
> #26 0x0000000000887728 in AsyncCall::make (this=0x1285f50) at
> AsyncCall.cc:40
> #27 0x0000000000888270 in AsyncCallQueue::fireNext (this=0xe11e80) at
> AsyncCallQueue.cc:56
> #28 0x0000000000888068 in AsyncCallQueue::fire (this=0xe11e80) at
> AsyncCallQueue.cc:42
> #29 0x00000000006f4948 in EventLoop::dispatchCalls (this=0xfffff17422e8) at
> EventLoop.cc:144
> #30 0x00000000006f485c in EventLoop::runOnce (this=0xfffff17422e8) at
> EventLoop.cc:121
> #31 0x00000000006f46a0 in EventLoop::run (this=0xfffff17422e8) at
> EventLoop.cc:83
> #32 0x000000000076090c in SquidMain (argc=4, argv=0xfffff1742638) at
> main.cc:1709
> #33 0x000000000075fe70 in SquidMainSafe (argc=4, argv=0xfffff1742638) at
> main.cc:1417
> #34 0x000000000075fe3c in main (argc=4, argv=0xfffff1742638) at main.cc:1405
> 
> When theLevels is initialized, the assigned address is a multiple of 4 but
> is not aligned for unsigned long.
> Atomic operations on Arm only support aligned address access.
> This problem has not been encountered on the x86 platform.
> 
> 
> In the previous reply, there was another question that was not answered.
> 
>> I tried another way: ensuring that the theCapacity parameter passed in is
>> an even number also solves this problem.
> 
> Keep the item type uint32_t and modify the NotePageNeed call in the
> NoteMemoryNeeds method in the IpcIoFile.cc file as follows:
> 
>  Ipc::Mem::NotePageNeed(Ipc::Mem::PageId::ioPage,
>      static_cast<int>(itemsCount * 1.1) + static_cast<int>(itemsCount * 1.1) % 2);
> 
> 
> This modification ensures that the value obtained by the theCapacity
> parameter is even, which also solves this problem.
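
The arithmetic behind this rounding can be sketched as follows.
RoundUpToEven() and ItemsBytes() are hypothetical helpers, not Squid code.
Whether the byte count of the items actually determines where theLevels
starts is an assumption that still needs to be confirmed in the sources.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper mirroring the quoted expression: rounds a
// non-negative count up to the nearest even number.
inline int RoundUpToEven(const int n)
{
    return n + n % 2;
}

// Bytes occupied by `count` items of type uint32_t (count must be >= 0).
inline std::size_t ItemsBytes(const int count)
{
    return static_cast<std::size_t>(count) * sizeof(std::uint32_t);
}
```

An odd count of 4-byte items ends 4 bytes past an 8-byte boundary
(ItemsBytes(7) % 8 is 4), while an even count ends on one
(ItemsBytes(RoundUpToEven(7)) % 8 is 0). That would explain the observed
effect only if the items start on an 8-byte boundary and the atomic
counters are placed immediately after them.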
> 
> 
> After your analysis, please evaluate whether this method is reasonable.
> Thank you.
> 
> 
> Regards,
> hindsight


