[squid-users] Inconsistent accessing of the cache, craigslist.org images, wacky stuff.
Jester Purtteman
jester at optimera.us
Wed Oct 28 18:24:50 UTC 2015
> -----Original Message-----
> From: Amos Jeffries [mailto:squid3 at treenet.co.nz]
> Sent: Wednesday, October 28, 2015 10:31 AM
> To: Jester Purtteman <jester at optimera.us>; squid-users at lists.squid-
> cache.org
> Subject: Re: [squid-users] Inconsistent accessing of the cache, craigslist.org
> images, wacky stuff.
>
> On 29/10/2015 4:06 a.m., Jester Purtteman wrote:
> >
> >
> >> -----Original Message-----
> >> From: squid-users [mailto:squid-users-bounces at lists.squid-cache.org]
> >> On Behalf Of Amos Jeffries
> >> Sent: Tuesday, October 27, 2015 9:07 PM
> >> To: squid-users at lists.squid-cache.org
> >> Subject: Re: [squid-users] Inconsistent accessing of the cache,
> >> craigslist.org images, wacky stuff.
> >>
> >> On 28/10/2015 2:05 p.m., Jester Purtteman wrote:
> >>> So, here is the problem: I want to cache the images on craigslist.
> >>> The headers all look thoroughly cacheable, some browsers (I'm
> >>> glaring at you
> >>> Chrome) send with this thing that requests that they not be
> >>> cacheable,
> >>
> >> "this thing" being what exactly?
> >>
> > Thing -> rest of the request (you'd think someone who spoke a
> > language their entire life could use it, but clearly I still need
> > practice :)
> >
> >> I am aware of several nasty things Chrome sends that interfere with
> >> optimal HTTP use. But nothing that directly prohibits caching like you
> describe.
> >>
> >
> > The chrome version of the headers have two lines that make my eye
> twitch:
> >
> > Cache-Control: max-age=0
> > Upgrade-Insecure-Requests: 1
> >
> > Which (unless I don't understand what's going on, which is quite possible)
> means "I don't want the response cached, and if possible, could we securely
> transfer this picture of an old overpriced tractor? It's military grade
> intelligence information here that bad guys are trying to steal". Am I
> interpreting that wrong?
> >
>
> max-age=0 from client means "don't use whatever you have cached. Always
> go to the server for new content.".
> The rest you got.
>
> Are you using the reload or refresh button in your testing? That is expected
> to cause that max-age value.
>
> If it is just sending that all the time anyway we will need to update our hacks.
>
Refresh button, so that at least makes sense.
>
> >
> > So getting lazy and using 8.8.8.8 because I don't have to remember which
> server I installed bind or dnsmasq on has finally come back to haunt me... I
> actually had a nightmare of a time getting another system working over the
> same problem, I'm giving this a rating of highly plausible. I'll revise the
> structure, if that fixes the issue, I'll let you know.
> >
>
> Introducing auto-configuration :-)
>
> You don't have to configure Squid with dns_nameservers at all these days. If
> you omit it entirely Squid will load the system's resolv.conf and use whatever
> resolver(s) are in there.
>
>
The problem was that /etc/resolv.conf on my Squid cache was also pointing to different name servers than the ones the clients were using. Once I gave them all the same name server, everything else fell into step, and I discovered that because... read on
> >>>
> >>> So, big question, what debug level do I use to see this thing making
> >>> decisions on whether to cache, and any tips anyone has about this
> >>> would be appreciated. Thank you!
> >>
> >> debug_options 85,3 22,3
> >>
> >
> > I have used 22,3 which I gleaned from another post on this list, I find a lot of
> this in my cache.log:
> >
> > 2015/10/27 18:23:18.402| ctx: enter level 0:
> 'http://images.craigslist.org/00707_cL1v48AjUBR_300x300.jpg'
> > 2015/10/27 18:23:18.402| 22,3| http.cc(328) cacheableReply: NO because
> e:=p2XDIV/0x24afa00*3 has been released.
> > 2015/10/27 18:23:18.409| ctx: exit level 0
> >
> > I'll let you know if fixing DNS takes that out.
> >
>
> Hmm. I'm interested now. Will look that up when I have time later.
>
> Amos
So, after I read your first reply, I responded with a quick snippet of the log file produced at debug level 85,3. It looked like:
"""QUOTING ANOTHER EMAIL"""
2015/10/28 09:16:54.075| 85,3| client_side_request.cc(532) hostHeaderIpVerify: FAIL: validate IP 208.82.238.226:80 possible from Host:
2015/10/28 09:16:54.075| 85,3| client_side_request.cc(543) hostHeaderVerifyFailed: SECURITY ALERT: Host header forgery detected on local=208.82.238.226:80 remote=192.168.2.56 FD 20 flags=17 (local IP does not match any domain IP) on URL: http://seattle.craigslist.org/favicon.ico
""" END QUOTE """
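Based on my reading of http://wiki.squid-cache.org/KnowledgeBase/HostHeaderForgery it appears this is actually intended behavior: the IP the client connected to does not match any IP that Squid itself resolves for the Host header, which is what you get when the clients and Squid ask different name servers. That also explains why the object is being released and found non-cacheable.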
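For anyone finding this in the archives, here is a rough Python sketch of the comparison as I understand it from the log message. It is not Squid's code; the function names and the stand-in resolvers are made up for illustration, and only the address 208.82.238.226 comes from my log. The second stand-in answer (192.0.2.10) is a documentation address I picked to imitate a resolver that disagrees with the client.

```python
import ipaddress
import socket
from typing import Callable, Iterable


def system_resolver(host: str) -> Iterable[str]:
    """Ask the local system resolver for every address of host."""
    infos = socket.getaddrinfo(host, None, proto=socket.IPPROTO_TCP)
    return {info[4][0] for info in infos}


def host_matches_destination(
    host: str,
    dest_ip: str,
    resolver: Callable[[str], Iterable[str]] = system_resolver,
) -> bool:
    """Return True if dest_ip is among the addresses resolver gives for host."""
    wanted = ipaddress.ip_address(dest_ip)
    return any(ipaddress.ip_address(addr) == wanted for addr in resolver(host))


# Hypothetical resolvers: one agrees with the client, one does not.
def same_dns(host: str) -> Iterable[str]:
    return ["208.82.238.226"]


def other_dns(host: str) -> Iterable[str]:
    return ["192.0.2.10"]


print(host_matches_destination("seattle.craigslist.org", "208.82.238.226", same_dns))
print(host_matches_destination("seattle.craigslist.org", "208.82.238.226", other_dns))
```

The resolver is passed in so the mismatch case can be reproduced without touching real DNS. Leaving it out falls back to whatever the machine's own resolver returns, which is the answer that has to agree with the one the clients get.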
So, I just installed dnsmasq on two of my servers, pointed my clients toward that address, and so far it is working a whole lot better. My hit rate is up in the 10% range, and that is with a nearly empty cache, so that may be the trick. I only made the change a short time ago. More importantly, that error in the log has gone away and I am getting consistent caching behavior, so that is huge.
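If you want to check your own cache.log for the same alert, something along these lines should work. It is only a sketch: the regular expression is written against the one alert line quoted above, so other Squid versions may format the line differently, and the function names are my own.

```python
import re
from collections import Counter
from typing import Dict, Iterable, Iterator

# Pattern modelled on the single hostHeaderVerifyFailed line quoted above.
ALERT = re.compile(
    r"Host header forgery detected on local=(?P<local>\S+) "
    r"remote=(?P<remote>\S+).*? on URL: (?P<url>\S+)"
)


def forgery_alerts(lines: Iterable[str]) -> Iterator[Dict[str, str]]:
    """Yield the local address, client address and URL of each alert line."""
    for line in lines:
        match = ALERT.search(line)
        if match:
            yield match.groupdict()


def alerts_per_client(lines: Iterable[str]) -> Counter:
    """Count alert lines per client (the remote= field)."""
    return Counter(alert["remote"] for alert in forgery_alerts(lines))


sample = [
    "2015/10/28 09:16:54.075| 85,3| client_side_request.cc(543) "
    "hostHeaderVerifyFailed: SECURITY ALERT: Host header forgery detected on "
    "local=208.82.238.226:80 remote=192.168.2.56 FD 20 flags=17 "
    "(local IP does not match any domain IP) on URL: "
    "http://seattle.craigslist.org/favicon.ico",
    "2015/10/27 18:23:18.409| ctx: exit level 0",
]

for alert in forgery_alerts(sample):
    print(alert["remote"], alert["url"])
```

To run it on a real log, pass an open file instead of the sample list, for example `alerts_per_client(open(path))` with `path` set to wherever your cache.log lives. An empty counter after the DNS change is the result you are hoping for.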
Thank you!