[squid-users] What would be the maximum ufs\aufs cache_dir objects?

Eliezer Croitoru eliezer at ngtech.co.il
Mon Jul 17 17:34:15 UTC 2017


So basically, from what I understand, the limit of an AUFS\UFS cache_dir is:
16,777,215 objects.
So for a very loaded system it might be pretty "small".
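For reference, the limit follows directly from Squid's index layout (quoted at the bottom of this mail): swap_filen is a 25-bit *signed* bitfield, so one bit goes to the sign and 24 usable bits remain. A minimal Go sketch of that arithmetic:

    package main

    import "fmt"

    func main() {
        const swapFilenBits = 25 // width of Squid's swap_filen bitfield
        // One bit is lost to the sign, leaving 24 usable bits.
        maxObjects := (1 << (swapFilenBits - 1)) - 1
        fmt.Printf("%d (0x%X)\n", maxObjects, maxObjects) // 16777215 (0xFFFFFF)
    }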

I asked because:
I have seen the MongoDB eCAP adapter that stores chunks, and I didn't like it.
On the other hand, I wrote a cache_dir in Go which I am using for the Windows Updates caching proxy, and for now it surpasses the AUFS\UFS limits.

Based on the success of the Windows Updates Cache proxy, which strives to cache only public objects, I was thinking about writing something similar for more general usage.
The basic constraint on what would be cached: only objects with Cache-Control "public".
The first step would be an ICAP service (respmod) which will log requests and responses and decide which GET results are worth fetching later.
Squid currently does everything on-the-fly, while the transaction is being fetched by the client.
For an effective cache I believe we can compromise on another approach which relies on statistics.
The first rule is: not everything is worth caching!
Then, after understanding and configuring this, we can move on to fetch *public* objects only, once they reach a high number of repeated downloads.
This is actually how Google's cache and other similar cache systems work.
They first let traffic reach the "DB" or "DATASTORE" when it is seen for the first time.
Then, once an object passes a specific threshold, it is fetched by the cache system without any connection to the transactions which the clients consume.
It might not be the most effective caching method for very loaded systems, for specific big files, or for *very* high-cost upstream connections, but for many it will be fine.
The actual logic and implementation can use one of several algorithms, with LRU as the default and a couple of others as options; a sketch of the whole idea follows below.
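To make the idea concrete, here is a minimal Go sketch of the two pieces above: the public-only eligibility check and the hit-counting threshold that triggers an out-of-band fetch. The names (worthCaching, hitCounter, the threshold) are illustrative assumptions, not any existing Squid or ICAP API:

    package cache

    import (
        "net/http"
        "strings"
        "sync"
    )

    // worthCaching applies the first rule: only GET responses explicitly
    // marked Cache-Control "public" are candidates at all.
    func worthCaching(method string, resp *http.Response) bool {
        if method != http.MethodGet {
            return false
        }
        cc := strings.ToLower(resp.Header.Get("Cache-Control"))
        return strings.Contains(cc, "public")
    }

    // hitCounter counts repeated downloads per URL; once a URL crosses
    // the threshold, it is fetched out-of-band, with no connection to
    // the client transaction that triggered the count.
    type hitCounter struct {
        mu        sync.Mutex
        hits      map[string]int
        threshold int
        fetch     func(url string) // out-of-band fetcher, supplied by the store
    }

    func (c *hitCounter) record(url string) {
        c.mu.Lock()
        defer c.mu.Unlock()
        c.hits[url]++
        if c.hits[url] == c.threshold {
            go c.fetch(url) // fetch independently of the client transaction
        }
    }

The eviction policy (LRU by default, others optional) would live behind the store that fetch writes into.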

I believe that this logic will be good for certain systems and will remove all sorts of weird store\cache_dir limitations.
I already have a ready-to-use system, which I named "YouTube-Store", that allows the admin to download specific YouTube videos and serve them from a local web service.
It can be used together with an external_acl helper that redirects clients to a special page hosting the cached\stored video, with an option to bypass the cached version; a sketch of such a helper follows below.
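For illustration, a minimal Go sketch of such an external_acl helper, using the simple line-based helper protocol (one lookup key in, "OK" or "ERR" out); isStored is a hypothetical hook into the store's index:

    package main

    import (
        "bufio"
        "fmt"
        "os"
    )

    // isStored is a hypothetical lookup into the YouTube-Store index.
    func isStored(url string) bool {
        return false // replace with a real index lookup
    }

    func main() {
        in := bufio.NewScanner(os.Stdin)
        out := bufio.NewWriter(os.Stdout)
        for in.Scan() { // Squid sends one lookup key (e.g. %URI) per line
            if isStored(in.Text()) {
                fmt.Fprintln(out, "OK")
            } else {
                fmt.Fprintln(out, "ERR")
            }
            out.Flush() // one reply per request line, flushed immediately
        }
    }

On the squid.conf side this would pair with an external_acl_type line passing %URI and a deny_info redirect to the special page, though the exact wiring depends on the deployment.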

I hope to publish this system soon under the BSD license.

Thanks,
Eliezer

----
Eliezer Croitoru
Linux System Administrator
Mobile: +972-5-28704261
Email: eliezer at ngtech.co.il



-----Original Message-----
From: squid-users [mailto:squid-users-bounces at lists.squid-cache.org] On Behalf Of Alex Rousskov
Sent: Friday, July 14, 2017 20:49
To: Amos Jeffries <squid3 at treenet.co.nz>; squid-users at lists.squid-cache.org
Subject: Re: [squid-users] What would be the maximum ufs\aufs cache_dir objects?

On 07/14/2017 10:47 AM, Amos Jeffries wrote:

> One UFS cache_dir can hold a maximum of (2^27)-1 safely. 

You probably meant to say (2^25)-1 but the actual number is (2^24)-1
because the sfileno is signed. This is why you get 16'777'215 (a.k.a.
0xFFFFFF) as the actual limit.


> The index hash entries are stored as a 32-bit bitmask (sfileno) - with 5
> bits for cache_dir ID and 27 bits for hash of the file details.

The cache index entries are hashed on their keys, not file numbers (of
any kind). The index entry is using 25 bits for the file number, but
IIRC, those 25 bits are never merged/combined with the 7 bits of the
cache_dir ID in any meaningful way.


Alex.

> typedef signed_int32_t sfileno;
>     sfileno swap_filen:25; // keep in sync with SwapFilenMax
>     sdirno swap_dirn:7;
> enum { SwapFilenMax = 0xFFFFFF }; // keep in sync with StoreEntry::swap_filen

_______________________________________________
squid-users mailing list
squid-users at lists.squid-cache.org
http://lists.squid-cache.org/listinfo/squid-users


