[squid-users] I would like to know performance sizing aspects.

Wed Aug 5 01:27:58 UTC 2020

On 5/08/20 11:28 am, m k wrote:
>> We are considering to use Squid for our proxy, and would like to know
>> performance sizing aspects.
>>
>> Current web access request averages per 1 hour are as follows:
>> Clients: 30,000
>> Page Views: 141,741/hour
>> Requests: 4,893,106
>>

Okay. Requests and client count are the important numbers there.

The ~1359 req/sec is well within a default Squid's capabilities, which
can extend up to around 10k req/sec before needing careful tuning.

That number was gained before HTTPS became so popular. So YMMV depending
on how many CONNECT tunnels you have to deal with. That HTTPS traffic
can possibly be decrypted and cached but performance trade-offs are
quite large.
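
The req/sec figure above comes straight from the hourly numbers in the
question. A minimal sketch of that arithmetic follows; the variable and
function names are made up for illustration, and the 10k value is only
the rough untuned ceiling mentioned above, not a guarantee. Note this is
an hourly average, so peak rates will be higher.

```python
# Hourly figures quoted in the original question.
requests_per_hour = 4_893_106
clients = 30_000

# Rough ceiling for an untuned Squid, as mentioned above (approximate).
default_ceiling_rps = 10_000


def per_second(count_per_hour: float) -> float:
    """Convert an hourly count into an average per-second rate."""
    return count_per_hour / 3600


avg_rps = per_second(requests_per_hour)

print(f"average rate: {avg_rps:.0f} req/sec")
print(f"share of rough ceiling: {avg_rps / default_ceiling_rps:.1%}")
print(f"requests per client per hour: {requests_per_hour / clients:.0f}")
```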

>> We will install Squid on CentOS 8.1.   Please kindly share your
>> thoughts / advice

Use whatever OS you are most comfortable with administering. Be aware
that CentOS official Squid packages are very slow to update - apparently
they still have only v4.4 (8 months old) despite an 8.2 point release
only a few weeks ago.

So you may need to be building your own from sources and/or using other
semi-official packagers such as the ones from Eliezer at NGTech when he
gets around to CentOS 8 packages.
  <https://wiki.squid-cache.org/KnowledgeBase/CentOS>

FYI: if you find yourself having to use SSL-Bump, then we highly
recommend following the latest Squid releases with fairly frequent
updates (at minimum a few times per year - worst case monthly). If you
like CentOS you may find Fedora more suitable for tracking the security
environment volatility and update churn.

>> Is there sizing methodology and tools?

There are a couple of methodologies, depending on what aspect you are
tuning towards - and one for identifying the limitation points to begin
a tuning process.

The info you gave above is the beginning: checking whether your traffic
rate is reasonably within the capability of a single Squid instance.

Yours is reasonable, so the next step is to get Squid running and see
where the trouble points (if any) are.

 For more see <https://wiki.squid-cache.org/SquidFaq/>

>> How much resources are generally recommended for our environment?
>> CPU: Memory: Disk space: Other factors to be considered, if any:
>> Do you have a generally recommended performance testing tools? Any
>> suggested guidelines?
>>

 CPU - Squid is still mostly single-process. So prioritize higher clock
speeds (GHz) over core count. Multi-core can help of course, but not as
much as cycle speeds do. Hyper-threading is useless for Squid.

 Memory - Squid will use as much as you can give it. Let your budget
govern this.

 Disk - Squid will happily run with no disk - or lots of large ones.

   - Avoid RAID. Squid *will* shorten disk lifetimes with its unusually
high write I/O pattern. How much shorter varies by disk type (HDD vs
SSD). So you may find it better to plan budget towards maintenance costs
of replacing disks in future rather than buying multiple up-front for
RAID use.
 see <https://wiki.squid-cache.org/SquidFaq/RAID> for details.

    - Up to a few hundred GB per cache_dir can be good for large caches.
Going up to TB is not (yet) worth the disk cost as Squid has a
per-cache_dir limit on the number of stored objects.

   - Disk caches can be re-tuned, added, moved, removed, and/or extended
at any time and will depend on the profile of object sizes your proxy
handles - which itself likely changes over time. So in general, let your
budget decide the initial disks and work from there.
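
To illustrate why a few hundred GB per cache_dir is about where the
object limit starts to bite, here is a rough sketch. It assumes the
per-cache_dir limit is 2**24 objects (my understanding of current Squid
releases, so check it against your version), and the mean object sizes
are hypothetical values picked for illustration, not measurements.

```python
# Assumption: a cache_dir can index at most 2**24 objects.
MAX_OBJECTS_PER_CACHE_DIR = 2 ** 24


def useful_cache_dir_size_gb(mean_object_kb: float) -> float:
    """Approximate size (GB) at which a cache_dir runs out of object slots."""
    total_kb = MAX_OBJECTS_PER_CACHE_DIR * mean_object_kb
    return total_kb / (1024 * 1024)


# Hypothetical mean object sizes in KB; measure your own traffic profile.
for mean_kb in (16, 32, 64):
    size_gb = useful_cache_dir_size_gb(mean_kb)
    print(f"{mean_kb} KB mean object -> ~{size_gb:.0f} GB usable per cache_dir")
```

Disk space beyond that point in a single cache_dir would sit unused
under these assumptions, which is the reason to prefer several moderate
cache_dir entries over one very large one.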

Load Testing - the tools we devs use to review performance are listed at
the bottom of the profiling FAQ page. These are best for testing the
theoretical limits of a particular installation - real traffic tends to
be somewhat lower. So I personally prefer taking stats from the running
proxy on real traffic and seeing what I can observe from those.

HTH
Amos