[squid-users] High cpu usage by re_search_internal

Amos Jeffries squid3 at treenet.co.nz
Sat Oct 4 15:23:49 UTC 2014


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 5/10/2014 4:12 a.m., Amos Jeffries wrote:
> On 5/10/2014 3:34 a.m., Omid Kosari wrote:
>> Mehdi Sarmadi wrote
>>> Hey
>>> 
>>> Alright. About the refresh patterns: you have a very excessive
>>> list, IMHO. I don't know about your hardware, but for typical
>>> general-purpose SMB server hardware that's too much. If you
>>> want to stick with it and can't reduce the list, check how many
>>> cores your machine has. You should know that Squid naturally
>>> sticks to a single core. A solution is to start multiple Squid
>>> instances; that way the Squid (refresh_pattern) load is
>>> distributed over more than one CPU core, and you'll get better
>>> performance.
>>> 
>>> Hope it helps Cheers
> 
>> Thanks for the tip. It has a Core i3 CPU, so it has 4 cores.
>> Unfortunately Squid does not spread load well across all cores,
>> especially in older versions like mine (v3.1). Multi-instance
>> has its own complexity and headaches. I am trying to keep a
>> clean design to stay away from those problems. It would be very
>> useful if Squid could do the refresh_pattern work on other
>> cores, or some trick like that.
> 
> 
> Here are some tips for your patterns:
> 
> * all the (.+\.)? at the beginning are useless complication.
> Remove.
> 
> * so are the .*?$ at the end of some patterns. This bit is also
> probably doing more harm than good, because the $ hints to regex
> that it should scan right-to-left, and the path and query portions
> of the URL are the largest pieces to scan over. Remove.
> 
> * the following two lines are redundant. The first will match 
> everything the second would have caught. Drop the second one.
> 
> refresh_pattern -i \.htm 120 50% 10080 reload-into-ims 
> refresh_pattern -i \.html 120 50% 10080 reload-into-ims
> 
> * the pattern below the comment "#Very aggressive 120 Days"
> contains duplicates.
> 
> 
> There are probably some smaller fixes, but those are the biggest I
> can see short of suggesting you drop those patterns entirely.
> 
> You would benefit from an upgrade of Squid. The later versions
> have eliminated the need for ignore-no-cache, ignore-private,
> ignore-auth (the latter two there do really, really bad things).
> 
> I am also curious why you are ignoring must-revalidate. It is a
> bandwidth reduction mechanism.
> 
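
The first three tips above can be checked with a small sketch. This
is Python, on the assumption that its re module agrees with the
regex library Squid uses for patterns this simple; the two engines
are not identical in general. The URLs and the example\.com pattern
are made up for the demonstration and are not from the original
config.

```python
import re

# Hypothetical sample URLs, not taken from the original configuration.
URLS = [
    "http://www.example.com/index.html",
    "http://example.com/page.htm?x=1",
    "http://cdn.example.com/archive.rar",
    "http://example.com/",
]

def matched(pattern, urls=URLS):
    """Return the URLs accepted by an unanchored, case-insensitive search."""
    rx = re.compile(pattern, re.IGNORECASE)
    return [u for u in urls if rx.search(u)]

# An optional prefix and a lazy "anything up to the end" suffix
# do not change which URLs are accepted by an unanchored search.
plain = matched(r"example\.com")
padded = matched(r"(.+\.)?example\.com.*?$")
same_set = plain == padded

# Every URL accepted by \.html is also accepted by \.htm.
htm = matched(r"\.htm")
html = matched(r"\.html")
html_is_subset = set(html) <= set(htm)
```

The padded pattern accepts exactly the same URLs as the plain one,
so the extra pieces only add work for the regex engine.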

I'm finding more problems the more I look.

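Things like (10\.10\.34\.34|peyvandha\.ir) being in the pattern set at
the top mean that the same section of pattern later on,
(10\.10\.34\.34|peyvandha\.ir|  in the 120 day set, is useless. Squid
uses the first refresh_pattern line that matches, so the later copy
never gets a chance to match those URLs.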
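
A rough sketch of that ordering effect, in Python rather than
squid.conf. The rule labels are hypothetical, and example\.org is a
stand-in for the rest of the 120 day alternation, which is not
reproduced here. It assumes the first-match-wins behaviour of
refresh_pattern.

```python
import re

# Hypothetical, simplified rule list: (label, regex). Order matters.
RULES = [
    ("top set", r"(10\.10\.34\.34|peyvandha\.ir)"),
    # example\.org stands in for the remaining alternatives.
    ("120 day set", r"(10\.10\.34\.34|peyvandha\.ir|example\.org)"),
]

def first_match(url, rules=RULES):
    """Return the label of the first rule whose regex is found in url."""
    for label, pattern in rules:
        if re.search(pattern, url, re.IGNORECASE):
            return label
    return None

top_hit = first_match("http://peyvandha.ir/a.jpg")
ip_hit = first_match("http://10.10.34.34/x")
later_hit = first_match("http://example.org/")
```

Both the hostname and the IP address are claimed by the top set, so
only the stand-in alternative ever reaches the 120 day rule.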

Similar things in the "All files" set.
 *  rar for example is listed twice,
 *  (jp(e?g|e|2)|jpg  matches jpg ... or jpg.

Also, when matching one character out of several, use square bracket
syntax. Instead of things like ms(i|u|p)   ... make it:  ms[iup]
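
A quick way to convince yourself that these two rewrites are safe,
sketched in Python (again assuming its re module agrees with Squid's
regex library on patterns this simple). The extension list is made
up, and fullmatch is used only to test the fragment in isolation.

```python
import re

# Hypothetical list of file extensions to test the fragments against.
EXTS = ["jpg", "jpeg", "jpe", "jp2", "msi", "msu", "msp", "msx", "png"]

def accepted(pattern, exts=EXTS):
    """Return the extensions that the pattern matches in full."""
    rx = re.compile(pattern)
    return [e for e in exts if rx.fullmatch(e)]

# The trailing |jpg is already covered by jp(e?g|e|2).
with_dup = accepted(r"jp(e?g|e|2)|jpg")
without_dup = accepted(r"jp(e?g|e|2)")

# A character class accepts the same strings as single-character alternation.
grouped = accepted(r"ms(i|u|p)")
bracket = accepted(r"ms[iup]")
```

In both pairs the shorter pattern accepts the same extensions as the
longer one.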


Amos
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (MingW32)

iQEcBAEBAgAGBQJUMBEEAAoJELJo5wb/XPRjjxoH/RcYxK+rRSwdUqbetPJOdtMf
e0ApMl2N6pc+dD8HpY5ZcAB5rSOSEepinETVgpeOI33mDT7H4m7NlUvLStHBTzan
SFWpoGWICUDiDwh0I/nMJTBByz8la073Rw7uKwl0AL2j3P9WtJoPyd4J8pSbs+BD
5YcQyJU0nWxlx05qbayl7Fe1R3wCWwA3xWtLTXnxEqnfQ+u69g4o+0XWSh7A0tC9
WVcl1BE4GaBXjSFKdkx1waR6ZIXeaEuI0GRSe57MCVk2e5P/nkLGwBvepw/dr7YN
nHbeC4LI/vOKhS0yZVsa5z6y10Ov2vLTnAf2kXgpA2Ud7lZY72MAGYbhJNnWNiU=
=y94q
-----END PGP SIGNATURE-----

