[squid-users] Pages sometimes load as a mess of random (?) symbols

Amos Jeffries squid3 at treenet.co.nz
Thu Oct 5 07:39:02 UTC 2017


On 05/10/17 19:42, Grey wrote:
> Sorry for not including enough information in the first place.
> 
> 1. Here's my config, keep in mind it's a test server that will eventually
> replace the one (not updated) we're using right now so the configuration is
> kinda bare-bones:
> 
> ### TESTSQUID1 ###
> 
> http_port 3128
> dns_v4_first on
> pinger_enable off
> netdb_filename none
> 
> error_default_language it
> cache_mgr helpdesk@test.it
> 
> acl SSL_ports port 443
> acl Safe_ports port 80		# http
> acl Safe_ports port 21		# ftp
> acl Safe_ports port 443		# https
> acl Safe_ports port 70		# gopher
> acl Safe_ports port 210		# wais
> acl Safe_ports port 1025-65535	# unregistered ports
> acl Safe_ports port 280		# http-mgmt
> acl Safe_ports port 488		# gss-http
> acl Safe_ports port 591		# filemaker
> acl Safe_ports port 777		# multiling http
> acl CONNECT method CONNECT
> 
> auth_param negotiate program /usr/lib/squid/negotiate_kerberos_auth -r -d
> auth_param negotiate children 150
> auth_param negotiate keep_alive on
> 
> external_acl_type ProxyUser children-max=75 %LOGIN /usr/lib/squid/ext_kerberos_ldap_group_acl -g INTERNET@TEST.LOCAL -D TEST.LOCAL -S testldap
> acl ProxyUser external ProxyUser
> 
> acl AUTH proxy_auth REQUIRED
> http_access deny !AUTH all

So, two problems.
1) 'all' here means clients with incorrect OR missing auth credentials 
do not get challenged for working credentials. Since any sane client 
security system will not present credentials until told they are 
necessary, the above should rightfully prevent *any* secure clients from 
using this proxy.

2) your custom config lines should be placed below the default security 
settings. This is especially important for ACLs like auth, which involve 
a lot of background work. The default settings are there to block things 
like DoS or other attacks that can be trivially and quickly denied, and 
to do so with minimal CPU expense.
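
As a hedged sketch of that ordering (ACL names taken from the config 
above; the exact shipped defaults depend on your Squid version):

  # Fast, cheap denials first (the shipped default security rules)
  http_access deny !Safe_ports
  http_access deny CONNECT !SSL_ports
  http_access allow localhost manager
  http_access deny manager

  # Expensive auth checks afterwards; no trailing 'all', so clients
  # without credentials still receive the 407 challenge
  http_access deny !AUTH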

> 
> http_access deny !Safe_ports all
> http_access deny CONNECT !SSL_ports all
> http_access allow localhost manager
> http_access deny manager all
> http_access allow localhost all

If you place the "allow localhost" above the "deny manager", you can 
remove one extra line of checks.
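
That is, something like:

  http_access allow localhost
  http_access deny manager

already covers localhost access to the manager, so the separate "allow 
localhost manager" line can go.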

> 
> acl destsquid dstdomain .testquid1 .testsquid2
> http_access allow destsquid all

The 'all' ACL is a pointless waste of CPU cycles on all of the lines above.
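
For example, these two lines behave identically, since 'all' matches 
every request:

  http_access allow destsquid all
  http_access allow destsquid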

> 
> http_access allow ProxyUser all
The 'all' ACL here *might* prevent unauthenticated clients from being 
challenged for credentials like the 'deny !AUTH' line did. But YMMV. It 
either does that or is pointless.

The current 3.5 series provides the %un format code, which should not 
generate an auth challenge. That should eliminate the need for the 
all-hack here.
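
A hedged sketch of what that could look like, reusing the helper line 
from the config above (note the list archive renders '@' as ' at '; the 
real config needs '@'):

  external_acl_type ProxyUser children-max=75 %un /usr/lib/squid/ext_kerberos_ldap_group_acl -g INTERNET@TEST.LOCAL -D TEST.LOCAL -S testldap
  acl ProxyUser external ProxyUser
  http_access allow ProxyUser

With %un the helper receives whatever username is already known, so the 
external ACL lookup itself does not trigger a 407 challenge.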


> http_access deny all
> 
> icap_enable on
> icap_send_client_ip on
> icap_send_client_username on
> icap_client_username_encode off
> icap_client_username_header X-Authenticated-User
> icap_preview_enable on
> icap_preview_size 1024
> icap_service service_req reqmod_precache bypass=1 icap://testicap:1344/REQ-Service
> adaptation_access service_req allow all
> icap_service service_resp respmod_precache bypass=0 icap://testicap:1344/resp
> adaptation_access service_resp allow all
> 
> coredump_dir /var/spool/squid
> 
> refresh_pattern ^ftp:		1440	20%	10080
> refresh_pattern ^gopher:	1440	0%	1440
> refresh_pattern -i (/cgi-bin/|\?) 0	0%	0
> refresh_pattern .		0	20%	4320
> 
> 2. This is the access log when first loading the page:
> 
> 1507185342.611      0 99.99.99.99 TCP_DENIED/407 5179 GET http://www.tomshardware.com/ - HIER_NONE/- text/html
> 1507185344.121   1473 99.99.99.99 TCP_MISS/200 48225 GET http://www.tomshardware.com/ testuser HIER_DIRECT/23.40.112.227 text/html
> 
> And this is the one after reloading:
> 

By "reloading" do you mean:

  * using a testing tool that sends an identical repeat request? or
  * clicking + pressing enter in a browser address bar? or
  * pressing the browser reload button? or
  * pressing the force-refresh (F5) button? or
  * holding shift while doing any of the above?

Only the first two methods above perform a clean HTTP test request. 
The others all deliver cache controls that force specific cache 
behaviour, which voids the test results.
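
For a clean repeat test, a command-line client avoids browser cache 
controls entirely. A sketch assuming the proxy listens on 
testsquid1:3128 and Negotiate/Kerberos credentials are available:

  curl -v -o /dev/null -x http://testsquid1:3128 --proxy-negotiate -u : http://www.tomshardware.com/

Running the identical command twice sends two identical requests with no 
Cache-Control headers added by the client.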


> 1507185356.932    187 99.99.99.99 TCP_MISS/200 47858 GET http://www.tomshardware.com/ testuser HIER_DIRECT/23.40.112.227 text/html
> 1507185357.425      0 99.99.99.99 TCP_DENIED/407 4440 GET http://platform.twitter.com/widgets.js - HIER_NONE/- text/html
> 1507185357.482     13 99.99.99.99 TCP_MISS/200 2019 GET http://www.tomshardware.com/medias/favicon/favicon-32x32.png? testuser HIER_DIRECT/23.40.112.227 image/png
> 1507185357.548     61 99.99.99.99 TCP_REFRESH_UNMODIFIED/304 516 GET http://platform.twitter.com/widgets.js testuser HIER_DIRECT/199.96.57.6 -
> 1507185357.565      0 99.99.99.99 TCP_DENIED/407 4178 CONNECT www.tomshardware.com:443 - HIER_NONE/- text/html
> 1507185357.924      0 99.99.99.99 TCP_DENIED/407 4190 CONNECT syndication.twitter.com:443 - HIER_NONE/- text/html
> 
> 3. The result of the test at redbot
> (https://redbot.org/?uri=http%3A%2F%2Fwww.tomshardware.com%2F if you want to
> check it yourself) is:
> 
> General
> The Pragma header is deprecated.
> The Content-Length header is correct.
> Content Negotiation (Content Negotiation response )
> The resource doesn't send Vary consistently.

  ^^ this one is what I meant. There are several side effects of this - 
mostly just annoying MISS behaviours, but sometimes the wrong 
content-type can end up being associated with a cached object, and 
things then appear as in the problem you described.

Also, IIRC NginX (which appears to be the server for that site) was 
known some years back to have several bugs that led to this type of 
broken content-type behaviour. I'm not sure if that ever got fixed.


> The response body is different when content negotiation happens.
> Content negotiation for gzip compression is supported, saving 86%.
> Caching
> Pragma: no-cache is a request directive, not a response directive.
> This response can't be stored by a cache.
> 
> So it indeed seems that this could be the problem, right? Anything I can do
> on my end to resolve/mitigate it?


I see that the server is already sending out "Cache-Control: no-store", 
so the problem is not your Squid but something upstream. Just make sure 
you do not override that no-store for these sites, and your proxy will 
continue not to add problems.
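
For clarity, the kind of override to avoid looks like this (a 
hypothetical line, not present in the posted config):

  refresh_pattern -i tomshardware\.com 1440 20% 10080 ignore-no-store ignore-reload

The ignore-no-store violation would make Squid cache these responses 
despite the server's Cache-Control, re-introducing the Vary/content-type 
problem locally.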

It may be the ICAP service mangling the response type from gzip to 
text/plain incorrectly (i.e. without actually unzipping), or removing 
the relevant cache-controls.

Or, equally likely, an upstream proxy is force-caching, thus making 
itself *and* yours run afoul of the Vary issue. This is where I suspect 
NginX or some hidden intermediary.

Amos
