[squid-users] HDD/RAM Capacity vs store_avg_object_size
Amos Jeffries
squid3 at treenet.co.nz
Wed Jul 12 16:11:56 UTC 2017
On 12/07/17 22:31, bugreporter wrote:
> Hi,
>
> Can anybody help me to confirm my understanding of the memory usage vs the
> persistent cache capacity? Below my understanding:
>
> According to http://wiki.squid-cache.org/SquidFaq/SquidMemory:
>
> 1- We need 14 MB of memory per 1 GB on disk for 64-bit Squid. The wiki has
> been there for as long as I have known Squid (i.e. I'm very old now). Is this
> information still valid?
Yes. It is a rough estimate based on the size of code objects used to
store each request message - they have not changed in at least the past
10 years. There may be some variance based on extra headers modern HTTP
contains. But that is not a huge amount and the number is a rough
estimate to begin with.
>
> 2- Is this assumption based on the default value of 13 KB for
> *store_avg_object_size*?
No.
That avg object size is for the full object with payload. Those payloads
are stored inside cache_mem or cache_dir, and do not take up index
space. So their total is limited to whatever size you configure those
storage areas to be.
Squid uses the above directive for its startup initialization of the
index's hash table. The table can be changed dynamically, but that is
quite expensive in terms of CPU cycles and would delay some requests so
this is a nice shortcut to avoid most pauses.
The 10 or 14 MB is purely for the metadata necessary to index those
cached objects, which is the HTTP message header text plus a bunch of
Squid code objects.
>
> 3- If answers to questions above are both YES, can we deduce that we need
> *182* bytes in memory per object in the persistent cache on 64x system?
> [*182* = (14 * 1024 * 1024) / (1024 * 1024 / store_avg_object_size)]
If you want to re-do the calculations for your own proxy start with the
values from the cachemgr "mem" report.
To get the metadata size add the per-object sizes (first number column)
of HttpReply + MemObject + HttpHeaderEntry + all objects whose name
starts with HttpHdr* + StoreEntry + all objects whose name starts with
StoreMeta*.
The rest is harder. You need to do a scan of a disk cache separating out
the message headers, counting both the number of items found and the
total size of the headers processed. Then multiply the metadata size by
the number of objects in the cache and add the total message header size.
You now have total index size and total cache size for a given cache.
Getting the N per GB from that should be easy and obvious.
NP: The mgr:mem "In Use" count of StoreEntry gives you approximately the
number of currently indexed objects. It does include some non-cacheable
objects that are currently being replied to, so it is not completely
accurate. You can use that to see how the index memory use compares to
the memory use for extra in-transit data.
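The recalculation described above can be sketched in Python. This is only an illustration of the arithmetic: the function names are hypothetical, and the pool sizes, object count and header total in the usage lines are made-up placeholders, to be replaced with the per-object sizes from your own mgr:mem report and the results of your own disk cache scan.

```python
# Pool names listed in the reply; sizes come from the first number
# column (per-object size, in bytes) of the cachemgr "mem" report.
EXACT_POOLS = {"HttpReply", "MemObject", "HttpHeaderEntry", "StoreEntry"}
PREFIX_POOLS = ("HttpHdr", "StoreMeta")


def metadata_bytes_per_object(pool_sizes):
    """Sum the per-object sizes of the pools that make up index metadata."""
    return sum(
        size
        for name, size in pool_sizes.items()
        if name in EXACT_POOLS or name.startswith(PREFIX_POOLS)
    )


def index_mb_per_gb(metadata_bytes, object_count, total_header_bytes, cache_bytes):
    """Index size in MB per GB of cache, from a scan of one cache."""
    index_bytes = metadata_bytes * object_count + total_header_bytes
    cache_gb = cache_bytes / 1024**3
    return index_bytes / 1024**2 / cache_gb


# Placeholder values only, NOT real Squid pool sizes.
pools = {"HttpReply": 100, "MemObject": 200, "HttpHeaderEntry": 50,
         "HttpHdrCc": 30, "StoreEntry": 100, "StoreMetaURL": 20,
         "Short Strings": 40}  # the last pool is ignored by the filter
per_object = metadata_bytes_per_object(pools)
print(index_mb_per_gb(per_object, 1_000_000, 300_000_000, 100 * 1024**3))
```

The StoreEntry "In Use" count from mgr:mem could stand in for the object count as a first approximation, with the caveat about in-transit objects noted above.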
> 4- Today the *store_avg_object_size* should be really greater than 13 KB.
> The mean object size I can see on my own cache is about 100 KB. Can anybody
> refer me to a website where I can find fresh information?
The value for your particular Squid can be found in the cachemgr "info"
report. It is listed as "Mean Object Size".
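If you want to pull that value out of the report automatically, something like the following Python sketch may do. It assumes the line looks like the samples quoted in this mail ("Mean Object Size:" followed by a number and "KB") and that you have already captured the report text, for instance from the output of `squidclient mgr:info`. The function name is hypothetical.

```python
import re


def mean_object_size_kb(info_report):
    """Return the 'Mean Object Size' value in KB, or None if not found."""
    match = re.search(r"Mean Object Size:\s*([\d.]+)\s*KB", info_report)
    return float(match.group(1)) if match else None


# Sample line taken from this mail; a real report has many more lines.
print(mean_object_size_kb("Mean Object Size: 106.08 KB"))
```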
It varies between proxies, and is directly dependent on what your
particular cache settings are compared to the traffic that proxy sees.
So even two proxies receiving the same traffic might show very different
values and it is unlikely that any reference material you find by other
people will be anything more than a rough approximation.
For example, my test proxy caching ISP-type traffic, with a fair bit of
Facebook, YouTube etc. going through it:
"
Mean Object Size: 106.08 KB
"
and a production CDN proxy in front of mostly Wordpress sites:
"
Mean Object Size: 19.20 KB
"
Both with a 200 GB cache_dir and otherwise default cache settings.
>
> 5- If I'm completely on a wrong way, can anybody help me to find a formula
> that can help me to deduce the required RAM for a given HDD capacity (and
> vice versa).
>
Still the same one listed in the wiki page.
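As a rough sketch of that rule of thumb in Python (function names are hypothetical; the 10 MB and 14 MB per GB figures are the 32-bit and 64-bit estimates discussed above, and the result covers the index only, not cache_mem or other Squid memory use):

```python
# Rough index memory estimate: MB of RAM per GB of cache_dir.
MB_PER_GB = {32: 10, 64: 14}


def index_ram_mb(cache_dir_gb, bits=64):
    """Approximate RAM (MB) needed to index a cache_dir of the given size."""
    return cache_dir_gb * MB_PER_GB[bits]


def max_cache_dir_gb(index_ram_budget_mb, bits=64):
    """The reverse: how much cache_dir a given index RAM budget can cover."""
    return index_ram_budget_mb / MB_PER_GB[bits]


print(index_ram_mb(200))        # index RAM for a 200 GB cache_dir, in MB
print(max_cache_dir_gb(4096))   # cache_dir GB coverable with 4 GB for the index
```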
Though nowadays the 2^27 objects per cache_dir limitation is proving to
be far more restrictive than the RAM index size. So depending on your
"Mean Object Size" you may find yourself limited to only using 100 GB or
less of a TB HDD.
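To see where the object-count cap bites for a given proxy, multiply the limit by the mean object size. A minimal Python sketch, with a hypothetical function name and the limit passed in as a parameter so you can use whatever figure applies to your Squid version and cache_dir type:

```python
def max_usable_gb(mean_object_size_kb, max_objects):
    """Largest cache_dir (GB) that a given object-count limit can fill."""
    return max_objects * mean_object_size_kb / 1024**2


# Mean Object Size from mgr:info; the limit is the figure quoted in this
# mail and should be checked against your own Squid build.
limit = 2**27
print(max_usable_gb(mean_object_size_kb=19.20, max_objects=limit))
```

Anything on the disk beyond that size cannot be filled by a single cache_dir, regardless of how much RAM is available for the index.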
Amos