[squid-users] Time for cache synchronization between siblings
bhsreenath at gmail.com
Thu Dec 17 12:21:28 UTC 2015
Thanks for the detailed response. I really appreciate it.
Unfortunately the load balancer we use is not a squid load balancer
and for now I will have to use HTCP.
Please take a look at the following lines from access.log of one of
the three squid servers.
1450351827.534 0 10.135.83.129 UDP_HIT/000 0 HTCP_TST
- HIER_NONE/- -
1450351827.562 20 10.135.83.129 TCP_HIT/200 553852 GET
- HIER_NONE/- video/mp2t
1450352028.731 0 10.135.83.128 UDP_MISS/000 0 HTCP_TST
- HIER_NONE/- -
The first line indicates a hit when queried by a peer. Note that the
IP address is 127.0.0.1.
It was a UDP HIT and it was followed by the actual request for the
cached object, which succeeded.
Now the third line indicates UDP query for same object, except that
URL has a different IP address, and the log says it was a MISS.
I don't know what I am doing wrong, but it consistently seems to treat
the IP address as part of the URL for purpose of HIT/MISS decision.
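As a rough illustration of why that would happen if the cache key is derived from the full absolute URL, here is a simplified sketch. The hashing scheme, port, and path below are assumptions made up for illustration; this is not Squid's actual code, and the real URLs are not shown in the log above.

```python
import hashlib


def cache_key(method: str, absolute_url: str) -> str:
    """Hypothetical cache key: a digest of the method plus the full absolute URL."""
    return hashlib.md5(f"{method} {absolute_url}".encode("utf-8")).hexdigest()


# Same (made-up) path, but a different host part in the URL.
key_local = cache_key("GET", "http://127.0.0.1:3128/video/seg1.ts")
key_remote = cache_key("GET", "http://10.135.83.128:3128/video/seg1.ts")

# The host is part of the hashed string, so the two keys differ.
same = key_local == key_remote
print(same)
```

Under this assumption, a lookup for the second URL cannot find an object stored under the first one, even though the path is identical.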
If all requests were made from a local client (say, using curl running
locally on the machine) and using 127.0.0.1 as the IP address, HTCP works.
Even without HTCP, just issuing the same request from localhost and
from another machine using the externally visible IP address, squid does
not appear to use the cached object. I am new to HTTP and think I must be
doing something wrong, but can't say what.
I wonder if ICP would have fared better since it uses just the URL.
Might that be a reason?
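The reply quoted below mentions the Store-ID feature as a way to change the store key to some other explicit string. A minimal, untested sketch of what such a helper might look like follows. It assumes the helper runs without concurrency, so each input line starts with the URL and the helper answers "OK store-id=..." or "ERR" (leave the key unchanged). The host list and the canonical host name are hypothetical placeholders, and whether this would also make HTCP lookups between siblings match is not something this sketch establishes.

```python
import sys
from urllib.parse import urlsplit, urlunsplit

# Assumption: the addresses that clients use to reach the caches.
SQUID_HOSTS = {"127.0.0.1", "10.135.83.128", "10.135.83.129"}
# Hypothetical placeholder used only as a common store key host.
CANONICAL_HOST = "media.squid.internal"


def store_id_reply(line: str) -> str:
    """Build the helper reply for one request line sent by Squid."""
    fields = line.split()
    if not fields:
        return "ERR"
    parts = urlsplit(fields[0])
    if parts.hostname not in SQUID_HOSTS:
        # Not one of our own addresses: leave the store key unchanged.
        return "ERR"
    # Replace host:port with the canonical host, keep path and query.
    normalized = urlunsplit(parts._replace(netloc=CANONICAL_HOST))
    return "OK store-id=" + normalized


def main() -> None:
    """Read request lines from stdin and answer each one on stdout."""
    for line in sys.stdin:
        sys.stdout.write(store_id_reply(line) + "\n")
        sys.stdout.flush()  # the reply must not sit in a buffer
```

When deployed as a helper, the script would end with a call to main(). With this mapping, a request for the same path via 127.0.0.1 and via 10.135.83.128 would produce the same store-id string.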
On 12/17/15, Amos Jeffries <squid3 at treenet.co.nz> wrote:
> On 17/12/2015 3:10 a.m., Sreenath BH wrote:
>> Thanks for the tips. After disabling digest I believe performance
>> However, I found that randomly requests were being routed to parent
>> even when siblings had the data cached.
>> From access.log I found TIMEOUT_CARP. I assumed this meant HTCP timed
>> out and squid was forced to go to fetch the data. So I increased
>> icp_query_timeout to 4000 milliseconds, and the hit rate increased
>> But I still find that sometimes, even after getting a HIT response
>> from a sibling, squid, for some reason still decides to go to the
>> parent for requested object.
>> Are there any other reasons why squid will decide to go to parent
> Just quirks of timing I think. Squid tracks response latency and prefers
> the fastest source. If the parent is responding faster than the sibling
> for many requests over a short period then Squid might switch to using
> the parent as first choice for a
> Some traffic is also classified as "non-hierarchical". Meaning that it
> makes no sense sending it to a sibling unless all parents are down.
> Things such as CONNECT, OPTIONS, POST etc where the response is not
> possible to be cached at the sibling.
>> And another question: When the hash key is computed for storing cache
>> objects, does Squid use the hostname(or IP address) also as part of
>> URL, or just the part that appears after the hostname/IP:port numbers?
> No. The primary Store ID/key is the absolute URL alone. Unless you are
> using the Store-ID feature of Squid to change it to some other explicit
> string value.
> If the URL produces a reply object with Vary header, then the expansion
> of the Vary header format is appended to the primary Store ID/key.
>> For example: if the IP addresses of the squid servers are 10.135.85.2 and
>> 10.135.85.3, and a request made to 1st server would have had the IP
>> address as part of the URL. However, next time same request is made to
>> server2, a different IP address would be used. Does this affect cache
>> hit at the sibling server?
>> I think it should not, but is this the case?
> Correct, the Squid IP has nothing to do with the cache storage.
>> We will have a load balancer that sends requests to each squid server,
>> and we want cache peering to work correctly in this case.
> FYI; the digest and HTCP algorithms you are dealing with are already
> load balancing algorithms. They are just designed for use in a flat
> 1-layer hierarchy.
> If you intend to have a 2-layer hierarchy (frontend LB and backend
> caches) I suggest you might want to look into Squid as the frontend LB
> using CARP algorithm. The CARP algorithm ensures deterministic storage
> locations for what URLs get sent to which caches. So there is no need
> for siblings communication as they all get unique URLs.
> * <http://wiki.squid-cache.org/ConfigExamples/SmpCarpCluster> has
> details of how to split the frontend and backend config. The specific
> example is for doing it using SMP workers within a single proxy
> instance. But the split can even more easily be done across different
> * <http://wiki.squid-cache.org/ConfigExamples/ExtremeCarpFrontend> has
> some details on how to add iptables port splitting on top of CARP to get
> ridiculously high performance out of a proxy hierarchy. The last numbers
> I heard from these setups were pushing just under the Gbps mark.
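
The deterministic URL-to-cache mapping described in the quoted reply can be sketched roughly as follows. This is a simplified highest-score (rendezvous) hashing example in the spirit of CARP; the backend names are hypothetical, and the hash function and the absence of load factors are assumptions that differ from the real CARP algorithm.

```python
import hashlib

# Hypothetical backend cache names.
BACKENDS = ["backend-a", "backend-b", "backend-c"]


def score(backend: str, url: str) -> int:
    """Combine backend name and URL into one deterministic score."""
    digest = hashlib.sha256(f"{backend}|{url}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def pick_backend(url: str, backends=BACKENDS) -> str:
    """Return the backend with the highest score for this URL."""
    return max(backends, key=lambda b: score(b, url))


# Hypothetical URL; every frontend computing this picks the same backend.
print(pick_backend("http://example.com/video/seg1.ts"))
```

Because every frontend computes the same scores, a given URL always lands on the same backend, so each object is stored once and the backends do not need to query each other.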