[squid-users] AUFS vs. DISKS

Amos Jeffries squid3 at treenet.co.nz
Wed Jul 15 11:32:21 UTC 2015


On 15/07/2015 6:56 p.m., Stakres wrote:
> Hi All,
> 
> I face a weird issue regarding the DISKD cache_dir model and I would like
> to have your expertise here.
> 
> Here is the result of a cache object with an AUFS cache_dir:
> 1436916227.603    462 192.168.1.88 00:0c:29:6e:2c:99 TCP_HIT/200 10486356
> GET http://proof.ovh.net/files/10Mio.dat - HIER_NONE/-
> application/octet-stream 0x30
> 
> Now, here is the same object from the same Squid box but using the DISKD
> cache_dir:
> 1436916293.648  24281 192.168.1.88 00:0c:29:6e:2c:99 TCP_HIT/200 10486356
> GET http://proof.ovh.net/files/10Mio.dat - HIER_NONE/-
> application/octet-stream 0x30
> 
> Do you see something weird?
> This is the same Squid (3.5.5), I just changed from AUFS to DISKD and
> restarted Squid...
> 
> Same object from the cache, but *0.462 sec* with AUFS and *24.281 sec*
> with DISKD.
> 52 times faster with AUFS, why?
> 

Now there's a whole tale:

(NP: years are rough guesses)

[1980's]
In the beginning Squid was a single-threaded process that did its own
HDD I/O using fread()/fwrite() POSIX API calls to access stored content
in a cache format called UFS (Unix File System).

Squid could then race at speeds of up to 50 req/sec on the hardware of
the day. In the age of 9600 baud modems this was fine, but time flew
by; users increased in number and moved on to faster 56K modem
technology.

[1990's]
Experiments were done and it was shown that moving the I/O into a
separate process (a helper daemon) was 4x faster, even though the file
data was copied in memory between the helper and Squid before being
relayed to the users. Simply pushing the I/O load into a helper
increased Squid's ability to service more requests. This was named
diskd (Disk Daemon), an alternative to UFS.

Squid could then race at speeds of up to 200 req/sec on the hardware of
those newer days. But time moved on, user numbers kept increasing, and
multi-Mbps DSL technology came along.

[2000's]
Multi-threaded applications became popular in mainstream computing. A
new module was added that pulled the I/O code back into the main Squid
process memory, but used threads to split the network and disk I/O
processing apart. (So Squid was no longer truly single-threaded,
although people would continue for more than a decade making false
claims that it was.) This avoided the memory copying between Squid and
the diskd helpers, and enabled much more parallel access to the HDD via
asynchronous operations. This was named AUFS (Asynchronous UFS).

Squid could then race at speeds of up to 600 req/sec on the hardware of
those newer days. But time continued still, user numbers kept on
increasing, and multi-Gbps fibre technology came along.

Hardware speeds have since doubled and tripled the basic achievable
speeds for each of these I/O methods: UFS achieving 150-200 req/sec,
diskd 200-300 req/sec, and AUFS exploding up towards 600 req/sec on
multi-core CPUs. But still it was not enough. Too much time was spent
waiting on disk reads.

[2005-ish]
Some people had the idea that all this per-file waiting was the cause of
their trouble and experimented with exotic things.

First came the Cyclic Object Storage System (COSS), which took responses
and stored them in great big groups, reading and writing to the disk
only by the thousand. Tricky this was, with many bugs, and the timing of
responses began to matter a lot. Nevertheless, despite those problems,
with most I/O now only involving RAM, Squid could reach speeds of nearly
900 req/sec.

Sadly these were also the years of the Squid-2.6/2.7 forking. The
Squid-3 version of COSS never saw many of the bug fixes implemented in
those versions and was quite bad.

[2010's]
Later more experiments were done and a database storage system was
implemented (Rock [which does not stand for anything I'm aware of]). It
combined all the benefits of memory access instead of disk for most
I/O, disk I/O for objects by the thousand, and also a more predictable
memory location for each object regardless of whether it was actually
in memory or on disk.

At the same time multi-core processor support was being added to Squid
in the form of multi-process support. So the Rock storage was again
split off into a separate daemon-like process (Disker), but this time
utilizing threads and memory sharing as well.

Today's Squid can reach upwards of 2000 req/sec for a single
worker+Disker [on fairly average hardware], or upwards of 8000 req/sec
with several workers and Diskers on high-end hardware.



> Any idea how to speed up diskd, or at least reduce the delay?

Nope. It's already going as fast as its little single thread can race.

> I could understand the response times not being identical, but this is
> the Grand Canyon!
> 

:-) Have pity on those poor Windows admins then. Capped out at 200
req/sec with direct UFS I/O, *and* having to share a mere 2048 sockets
between both disk and network I/O.


> My cache_dir option used in test:
> cache_dir diskd /var/spool/squid3w1 190780 16 256 min-size=0
> max-size=293038080
> 


Most of your slowdown will be on small (0-32 KB) disk hits.

If you graph your traffic profile (a count of objects at each size) you
will see a couple of significant bumps: one for small objects down
around the 0-32 KB range, and one for larger media objects up around
5-20 MB. These days there may also be smaller bump(s) [wiggles?] around
the 100-700 KB range.
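
As a minimal sketch of that counting, in Python, assuming the log
layout quoted above where the reply size is the 6th whitespace-separated
field (the MAC column suggests a customized logformat, so check the
field position and the log path on your own install; the bucket edges
are illustrative):

# bucket Squid access.log entries by reply size
from collections import Counter

buckets = Counter()
with open("/var/log/squid3/access.log") as log:  # path is an assumption
    for line in log:
        fields = line.split()
        try:
            size = int(fields[5])  # bytes delivered to the client
        except (IndexError, ValueError):
            continue  # skip malformed lines
        if size <= 32 * 1024:
            buckets["0-32 KB"] += 1
        elif size <= 700 * 1024:
            buckets["32-700 KB"] += 1
        elif size <= 5 * 1024 * 1024:
            buckets["700 KB-5 MB"] += 1
        else:
            buckets["5 MB+"] += 1

for name in ("0-32 KB", "32-700 KB", "700 KB-5 MB", "5 MB+"):
    print(name, buckets[name])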


The current recommendation is to use a Rock cache for the small objects,
and one or two of whichever UFS-based cache type is fastest on your OS
for the larger ones.
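
For example, a minimal squid.conf sketch of that split (the rock
directory path and its 16384 MB size are illustrative assumptions; the
aufs line reuses your own values, with min-size raised so the two
stores do not overlap):

# small objects (<= 32 KB) go to the rock store
cache_dir rock /var/spool/squid3-rock 16384 min-size=0 max-size=32768
# everything larger goes to the AUFS store
cache_dir aufs /var/spool/squid3w1 190780 16 256 min-size=32769 max-size=293038080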

If you have a significant number of those mid-range KB objects that need
high speeds, then an extra Rock cache tuned for larger cell sizes and
only storing those big-ish objects might also be useful.
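
Something like this hypothetical extra cache_dir, for instance.
slot-size is the rock cell size, which tops out at 32 KB (vs. the
16 KB default) in this Squid generation; the path, the 16384 MB size,
and the 700 KB upper bound are illustrative, and the AUFS min-size
above would need raising to match:

# mid-range objects (32 KB - 700 KB) in a second rock store
# using the largest available cell size
cache_dir rock /var/spool/squid3-rock2 16384 slot-size=32768 min-size=32769 max-size=716800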

Amos

