[squid-users] different results every time

Amos Jeffries squid3 at treenet.co.nz
Wed Aug 3 04:37:43 UTC 2016


On 3/08/2016 2:06 p.m., Sam M wrote:
> Reading through the documentation of the Collapsed Forwarding feature I don't
> know if this feature would help, as the problem I'm seeing appears to be in
> the squid eviction process and decision. It looks like squid is storing more
> than what is set in the cache_dir and not evicting the least recently used
> files at the right time because of the heavy request load.

A lot is going on. More on that below.

Also, no mistake about too-early eviction is possible, because the
"right time" is exactly when something else needs to use that piece of
cache space. Under heavy traffic the time-based eviction of data almost
never happens; everything cycles out due to load pressure far earlier
than it would naturally expire. The rare pieces of data that manage to
reach their stale timeout are evicted *later* than that staleness point.
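The load-pressure point above can be sketched in a few lines. This is an illustrative toy (plain Python, not Squid internals): a small LRU store under heavy traffic evicts purely on recency, so objects cycle out long before any staleness timer could matter.

```python
from collections import OrderedDict

class TinyLRUStore:
    """Toy LRU cache; names and sizes here are hypothetical."""

    def __init__(self, capacity):
        self.capacity = capacity          # max number of cached objects
        self.store = OrderedDict()        # key -> value, oldest first

    def get(self, key):
        if key in self.store:
            self.store.move_to_end(key)   # refresh recency on a HIT
            return self.store[key]
        return None                       # MISS

    def put(self, key, value):
        self.store[key] = value
        self.store.move_to_end(key)
        if len(self.store) > self.capacity:
            # Evict the least-recently-used entry, regardless of how
            # close (or far) it is from any time-based expiry.
            self.store.popitem(last=False)

cache = TinyLRUStore(capacity=3)
for n in range(10):                       # heavy load: 10 objects, room for 3
    cache.put(f"obj{n}", b"1MB payload")

print(sorted(cache.store))                # only the 3 most recent survive
```

Under that kind of pressure, no object lives long enough for its staleness timeout to ever be the reason it leaves the cache.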

> 
> Does that make sense, or is there other explanation to the issue I'm having?

Yes. CF is for overlapping requests from clients. It is one of the
things that may be going on. But it is not enabled by default.
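Since CF is off by default, it has to be switched on explicitly in squid.conf. A minimal fragment, assuming Squid 3.5 or later (where the directive is available in the 3.x series):

```
# collapse concurrent requests for the same URL into one upstream fetch
collapsed_forwarding on
```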


> On Tue, Aug 2, 2016 at 9:51 PM, Sam M wrote:
> 
>> Hi Eliezer,
>>
>> Thanks for your prompt reply. We are testing our squid configuration
>> before we use it. That said, all objects are 1 MB in size and in order to
>> test squid we queried a sequence of files multiple times in a manner that
>> theoretically at the end of the querying process we should get the same
>> number of hits from cache1, cache2, cache3, and cache4.
>>
>> Structure of test network is: User (using a script) -> cache1 -> cache2 ->
>> cache3 -> cache4 -> web server (stores the queried files).
>>
>> I'm gonna try the Collapsed Forwarding feature and will post back if this
>> fixes the issue.
>>

>>>
>>> *From:* Sam M
>>> *Sent:* Tuesday, August 2, 2016 8:43 AM
>>>
>>> Hi,
>>>
>>> I'm querying lots of files through 4 cache servers connected through
>>> parent hierarchy. I clean all the caches before I start and then I query
>>> the files again in the same exact order. Weirdly, every time I check the
>>> logs, I see a different cache served a file compared with the previous
>>> test. The query process is done through a python script that uses wget
>>> through a proxy to the cache, hence the query process is really fast.
>>>
>>> Interestingly, if I put a delay of 1 second between each query, the
>>> result will be stable and same every time I run the script.
>>>
>>> Following is a snippet from the config file. I've changed it many times
>>> trying to make it reproduce the same results, but that didn't help:
>>> cache_dir ufs /var/spool/squid 9 16 256
>>> cache_mem 0 MB
>>> memory_pools off
>>> cache_swap_low 100
>>> cache_swap_high 100
>>> maximum_object_size_in_memory 0 KB
>>> cache_replacement_policy lru
>>> range_offset_limit 0
>>> quick_abort_min 0 KB
>>> quick_abort_max 0 KB
>>>
>>>
>>>
>>> Can someone shed some light on the issue and how to fix it please?
>>>

TL;DR: Does not sound like a problem to me. That behaviour is how HTTP
works.


HTTP is stateless by design. Each request is evaluated at each proxy
independently.

The network itself is dynamic, both in timing and in known state. When
you are dealing with things on the nanosecond scale, details as low down
as the ARP cache, and maybe lower, affect the RTT and thus the timing
data Squid stores about its peers and reachable servers. The HTTP object
cache is just one amongst many types of cache having effects - both
inside and outside Squid.

At longer timescales, DNS results can be differently ordered, or
rotating per lookup, or plain different (but 'static') content per
lookup source.
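The DNS rotation effect is easy to picture. An illustrative sketch (plain Python, not Squid or resolver code): a name server that rotates its answer list per lookup, as many do for round-robin load balancing. A client that simply takes the first address connects somewhere different on each consecutive lookup.

```python
from itertools import cycle

# Hypothetical answer set for one name; addresses are from the
# documentation range 192.0.2.0/24.
addresses = ["192.0.2.10", "192.0.2.11", "192.0.2.12"]
rotation = cycle(range(len(addresses)))

def lookup(name):
    """Simulate a round-robin resolver: same records, rotated order."""
    start = next(rotation)                # rotate starting point each call
    return addresses[start:] + addresses[:start]

print(lookup("origin.example")[0])        # 192.0.2.10
print(lookup("origin.example")[0])        # 192.0.2.11 - same name, new first pick
```

Two identical test runs can thus take different upstream paths before any cache logic is even consulted.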

Even the server's memory access speed plays a part, by delaying traffic
(or not) by some nanoseconds during the cache index lookup.

There is also the absolute UTC timestamp of the request reaching the
origin server versus the Expires/Cache-Control/Age/Date headers it
produces, which are themselves affected by all sorts of things internal
to the origin. The delta between those timestamps and the cache's
absolute UTC timestamp on receiving the response is dynamic, and that
dynamic variance grows the smaller the timescale one looks at (i.e. the
faster the traffic).
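Those timestamp deltas are what the freshness arithmetic runs on. A hedged sketch of the age calculation along the lines of RFC 7234 (function and variable names here are illustrative, not Squid internals), showing how a 1-second shift in when the response arrived or when the next request lands flips fresh to stale:

```python
def is_fresh(date, expires, age_header, received_at, now):
    """All arguments are integer UNIX timestamps in seconds, since the
    HTTP header fields involved carry only 1-second resolution."""
    apparent_age = max(0, received_at - date)    # clock delta on receipt
    corrected_age = max(apparent_age, age_header)
    resident_time = now - received_at            # time spent in this cache
    current_age = corrected_age + resident_time
    freshness_lifetime = expires - date          # from the Expires header
    return current_age < freshness_lifetime

# Response generated at t=1000 with Expires at t=1005 (5s lifetime), Age: 0.
# Whether the follow-up request is a HIT depends on the 1-second-granular
# deltas; shifting arrival and request time by one second each flips it:
print(is_fresh(1000, 1005, 0, 1001, 1004))   # True  -> HIT
print(is_fresh(1000, 1005, 0, 1002, 1006))   # False -> MISS
```

Each proxy in the chain performs this arithmetic against its own clock, so the same response can land on different sides of the fresh/stale boundary at different hops.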

[probably more there I've missed].


All those little details affect, in some way, the determination of any
given request's destination and whether it is served from cache. And
that determination is made independently by each of the proxies in the
traffic chain, at the point in time where each separate request passes
through it.

So with 4 caches in your chain, all these tiny details are compounded 4
times on each request. Of course it's going to fluctuate, even in
isolated test traffic.


HTTP has a 1-second resolution on caching calculations for good reason.
And even that is not enough to average out the entire effect when you
compound the clock variance with multiple layers of proxy.

Unless you wait multiples of whole seconds between each test request,
you are guaranteed to see at least some variance in the HIT vs MISS
behaviour. Even with that wait you might see variation in which
particular cache was a HIT.


PS. Squid is only about 90% compliant with the HTTP/1.1 requirements, so
there are some known bugs in the caching logic that your testing may
encounter as well. Though at least bugs are "stable" in their behaviour
for a given proxy build.

HTH
Amos
