<div dir="ltr"><div><div><div><div>Marcus, thanks for the info.<br></div>The OS is CentOS 6 with kernel 2.6.32-504.30.3.el6.x86_64.<br></div>Yes, cpu_affinity_map is set correctly: with 6 instances there is load only on the first 6 cores. The server has 12 cores (24 with HT).<br></div>Each instance is bound to one core: instance 1 = core 1, instance 2 = core 2, and so on, so that should not be the problem.<br></div>I've tried 12 workers, but that's even worse. <br><div><br></div><div>Let me try to explain:<br></div><div>With non-SMP and traffic at ~300 Mbit/s we have a load of ~4 (on 6 workers).<br></div><div>In that case user time is about 10-20% and sys time is 70-80% (osq_lock), and there are no connection timeouts.<br><br></div><div>If I switch to SMP with 6 workers, user time goes up but sys time goes up too, there are connection timeouts, and the load jumps to ~12.<br></div><div>If I give it more workers, only the load rises and more connections are dropped, to the point that the load reaches 23-24 and the entire server is slow as hell.<br><br></div><div>So the best performance so far is with 6 non-SMP workers.<br><br></div><div>For now I have 2 options:<br></div><div>1. Install an older Squid (3.1.10 from the CentOS repo) and try that.<br></div><div>2. Build a custom 64-bit kernel with RCU and specific CPU family support (in progress).<br></div><div><br></div><div class="gmail_extra">The end goal is to be able to sustain 1 Gbit/s of traffic on this server :)<br></div><div class="gmail_extra">Any advice is welcome.<br></div><div class="gmail_extra"><br><br></div><div class="gmail_extra">J.<br></div><div class="gmail_extra"><br><div class="gmail_quote">2015-07-31 14:53 GMT+02:00 Marcus Kool <span dir="ltr"><<a href="mailto:marcus.kool@urlfilterdb.com" target="_blank">marcus.kool@urlfilterdb.com</a>></span>:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">osq_lock is used in the kernel for the implementation of a mutex.<br>

It is not clear which mutex so we can only guess.<br>

<br>

Which version of the kernel and distro do you use?<br>

<br>

Since mutexes are used by Squid SMP, I suggest switching to non-SMP Squid for now.<br>

<br>

What is the value of cpu_affinity_map in all config files?<br>

You say they are static. But do you allocate each instance on a different core?<br>

Does 'top' show that all CPUs are used?<br>

<br>

Do you have 24 cores or 12 hyperthreaded cores?<br>

In case you have 12 real cores, you might want to experiment with 12 instances of Squid and then try to upscale.<br>

<br>

Make maximum_object_size large; a max size of 16K will prohibit the retrieval of objects larger than 16K.<br>

I am not sure about 'maximum_object_size_in_memory 16 KB' but let it be infinite and do not worry since<br>

cache_mem is zero.<br>

<br>

Marcus<div><div><br>

<br>

<br>

On 07/31/2015 03:52 AM, Josip Makarevic wrote:<br>

</div></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div><div>

Hi Amos,<br>

<br>

  cache_mem 0<br>

  cache deny all<br>

<br>

already there.<br>

Regarding the number of NIC ports: we have four 10G Ethernet cards, two in each bonding interface.<br>

<br>

Well, entire config would be way too long but here is the static part:<br>

via off<br>

cpu_affinity_map process_numbers=1 cores=2<br>

forwarded_for delete<br>

visible_hostname squid1<br>

pid_filename /var/run/squid1.pid<br>

icp_port 0<br>

htcp_port 0<br>

icp_access deny all<br>

htcp_access deny all<br>

snmp_port 0<br>

snmp_access deny all<br>

dns_nameservers x.x.x.x<br>

cache_mem 0<br>

cache deny all<br>

pipeline_prefetch on<br>

memory_pools off<br>

maximum_object_size 16 KB<br>

maximum_object_size_in_memory 16 KB<br>

ipcache_size 0<br>

cache_store_log none<br>

half_closed_clients off<br>

include /etc/squid/rules<br>

access_log /var/log/squid/squid1-access.log<br>

cache_log /var/log/squid/squid1-cache.log<br>

coredump_dir /var/spool/squid/squid1<br>

refresh_pattern ^ftp:           1440    20%     10080<br>

refresh_pattern ^gopher:        1440    0%      1440<br>

refresh_pattern -i (/cgi-bin/|\?) 0     0%      0<br>

refresh_pattern .               0       20%     4320<br>

<br>

acl port0 myport 30000<br>

http_access allow testhost<br>

tcp_outgoing_address x.x.x.x port0<br>

<br>

The include is there for the basic ACLs (safe ports and so on) to minimize the config file footprint, since that part is static and the same for every worker.<br>

<br>

and so on 44 more times in this config file<br>
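The repeated per-port stanzas above could be generated rather than maintained by hand. A minimal sketch in Python, assuming consecutive ports and consecutive outgoing addresses; the function name port_stanzas and the 192.0.2.x documentation addresses are hypothetical placeholders, not values from the real config:<br>

```python
import ipaddress


def port_stanzas(base_port, first_ip, count):
    """Return squid.conf lines pairing consecutive ports with consecutive IPs.

    Assumption: ports and outgoing addresses both increase by one per stanza.
    The real deployment may use a different mapping.
    """
    start = ipaddress.ip_address(first_ip)
    lines = []
    for i in range(count):
        name = f"port{i}"
        # One ACL per listening port, as in the 'acl port0 myport 30000' line.
        lines.append(f"acl {name} myport {base_port + i}")
        # Bind that port's traffic to its own outgoing address.
        lines.append(f"tcp_outgoing_address {start + i} {name}")
    return lines


if __name__ == "__main__":
    # 45 stanzas: the one shown above plus the 44 repeats.
    print("\n".join(port_stanzas(30000, "192.0.2.10", 45)))
```

The output can be written to a per-instance file and pulled in with an include line, the same way /etc/squid/rules is.<br>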

<br>

Do you know of any good article on how to tune kernel locking, or have any idea why this is happening?<br>

I cannot find any good info on it; all I've found are bits and pieces of kernel source code.<br>

<br>

<br>

Tnx.<br>

J.<br>

<br></div></div>

2015-07-31 0:42 GMT+02:00 Amos Jeffries <<a href="mailto:squid3@treenet.co.nz" target="_blank">squid3@treenet.co.nz</a>>:<div><div><br>

<br>

    On 31/07/2015 8:05 a.m., Josip Makarevic wrote:<br>

    > Hi,<br>

    ><br>

    > I have a problem with squid setup (squid version 3.5.6, built from source,<br>

    > centos 6.6)<br>

    > I've tried 2 options:<br>

    > 1. SMP<br>

    > 2. NON-SMP<br>

    ><br>

    > I've decided to stick with custom build non-smp version and the thing is:<br>

    > - i don't need cache - any kind of it<br>

<br>

      cache_mem 0<br>

      cache deny all<br>

<br>

    That is it. All other caches used by Squid *are* mandatory for good<br>

    performance. And are only used anyway when the component that needs them<br>

    is actively used.<br>

<br>

<br>

    > - I have DNS cache just for that<br>

    > - squid has to listen on 1024 ports on 23 instances.<br>

    > each instance listens on set of ports and each port has different outgoing<br>

    > ip address.<br>

<br>

    And how many NICs do you have that spread over?<br>

<br>

    ><br>

    > The thing is this:<br>

    > It's all good until we hit it with more than 150 Mbit/s, then...<br>

    ><br>

    > (output from perf top)<br>

    >  84.57%  [kernel]                  [k] osq_lock<br>

    >   4.62%  [kernel]                  [k] mutex_spin_on_owner<br>

    >   1.41%  [kernel]                  [k] memcpy<br>

    >   0.79%  [kernel]                  [k] inet_dump_ifaddr<br>

    >   0.62%  [kernel]                  [k] memset<br>

    ><br>

    >  21:53:39 up 7 days, 10:38,  1 user,  load average: 24.01, 23.84, 23.33<br>

    > (yes, we have 24 cores)<br>

    > Same behavior is with SMP and NON-SMP setup (SMP setup is all in one file<br>

    > with workers 23 option but then I have to use rock cache)<br>

    ><br>

    > so, my question is....what...how to optimize this.....whatever....I'm stuck<br>

    > for days, I've tried many sysctl options but none of them works.<br>

    > Any help, info, something else?<br>

<br>

    None of those are Squid functionality. If you want help optimizing your<br>

    config and are willing to post it to the list I am happy to do a quick<br>

    audit and point out any problem areas for you.<br>

<br>

    But tuning the internal locking code of the kernel is way off topic.<br>

<br>

    Amos<br>

<br>

    _______________________________________________<br>

    squid-users mailing list<br></div></div>

    <a href="mailto:squid-users@lists.squid-cache.org" target="_blank">squid-users@lists.squid-cache.org</a><br>

    <a href="http://lists.squid-cache.org/listinfo/squid-users" rel="noreferrer" target="_blank">http://lists.squid-cache.org/listinfo/squid-users</a><span><br>


</span></blockquote>

</blockquote></div><br></div></div>