[squid-users] Squid 3.5 https facebook caching

Amos Jeffries squid3 at treenet.co.nz
Wed Apr 17 13:52:03 UTC 2019


On 17/04/19 3:04 am, tester100 wrote:
> Hi guys
> 
> i am currently using this setup on my squid 3.5.28 version for https
> filtering using ssl certificate
> 
> its caching http and https (some specific extensions) on facebook i can
> cache images,css, and other javascript files..
> 
> aldo when i press play to play the video and try to cache it , it simply
> does not play any videos i can only play the live feeds transmission, this
> is the squid.conf files and the store-id.pl i am using
> 
> 
> 
> # SQUID CONFIGURATION OF CYBERSCIE.COM
> #
> 
> acl localnet src 10.0.0.0/8	# RFC1918 possible internal network
> acl localnet src 172.16.0.0/12	# RFC1918 possible internal network
> acl localnet src fc00::/7       # RFC 4193 local private network range
> acl localnet src fe80::/10      # RFC 4291 link-local (directly plugged)
> machines
> acl localnet src 192.168.1.0/24 
> acl localnet src 192.168.2.0/24
> 
> acl SSL_ports port 443
> acl SSL_ports port 5353
> acl Safe_ports port 21
> acl Safe_ports port 22
> acl Safe_ports port 53
> acl Safe_ports port 70
> acl Safe_ports port 80
> acl Safe_ports port 210
> acl Safe_ports port 280
> acl Safe_ports port 1025-65535

The above line means that any of the *many* entries you have for ports
over 1024 are a pointless waste of memory and CPU cycles.
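
As a rough sketch of that cleanup, using only the low ports visible in
the quoted config: any individual entry above 1024 (for example a
hypothetical "acl Safe_ports port 8080") is already covered by the
range and can simply be dropped.

  acl Safe_ports port 21 22 53 70 80 210 280
  acl Safe_ports port 1025-65535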

Please start by running "squid -k parse" on your config and fix all the
issues that get mentioned.
...
> acl CONNECT method CONNECT
> 
> # ACCESS RULES
> http_access deny !Safe_ports
> http_access deny CONNECT !SSL_ports
> http_access allow localnet
> http_access allow localhost
> http_access deny all
> 
> # LISTENING PORT SQUID
> 
> http_port 3128 ssl-bump cert=/etc/squid/ssl_certs/myCA.pem
> generate-host-certificates=on dynamic_cert_mem_cache_size=4MB
> 
> 
> # CONNECTION HANDLING
> qos_flows local-hit=0x30
> collapsed_forwarding on	
> balance_on_multiple_ip on

The above breaks performance of server persistent connections....

> detect_broken_pconn on
> client_persistent_connections off
> server_persistent_connections on

... which is the only type of persistence you have enabled.
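
A minimal sketch of one way to resolve that, assuming client
persistence was not switched off deliberately. Both values shown are
the Squid defaults, so deleting the two quoted lines should have the
same effect:

  balance_on_multiple_ip off
  client_persistent_connections on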

> 
> # DNS OPTIONS
> #dns_packet_max 4096
> dns_defnames on
> dns_v4_first on
> connect_retries 2
> negative_dns_ttl 1 second
> quick_abort_min 0
> quick_abort_max 0
> quick_abort_pct 80
> range_offset_limit 0

Hm, so all transactions which are started will run until completion even
if no client needs that response.

> ipcache_low 98
> ipcache_high 99
> ipcache_size 4096
> fqdncache_size 2048
> pipeline_prefetch 0
> 
> # MISCELEANOUS
> memory_pools off
> reload_into_ims on
> max_filedescriptors 65536
> 
> # CACHE MANAGEMENT
> cache_mem 512 MB
> maximum_object_size_in_memory 128 KB
> memory_replacement_policy heap GDSF
> cache_effective_group proxy	
> cache_effective_user proxy
> cache_dir aufs /cache/cache 100000 16 256
> coredump_dir /cache/cache
> cache_mgr cyberscie
> visible_hostname someone at gmail.com

The above looks like an email address. Not an FQDN.

"hostname" is the name of the machine running Squid. To work properly it
should be an FQDN that can be resolved with DNS.
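
For illustration only, with proxy.example.com standing in as a
hypothetical name that resolves to the machine running Squid:

  visible_hostname proxy.example.com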


> minimum_object_size 0 KB
> maximum_object_size 1 GB
> read_ahead_gap 64 KB		#Amount of data to buffer from server to client
> cache_replacement_policy heap LFUDA
> store_dir_select_algorithm least-load
> cache_swap_low 90
> cache_swap_high 95
> 
> # LOG FILE OPTIONS
> logfile_daemon /usr/lib/squid/log_file_daemon
> access_log daemon:/var/log/squid/access.log squid
> cache_log /dev/null		#cache_log /var/log/squid/cache.log(to enable)

This is a bad idea. What do you expect will happen to the machine when
Squid renames the /dev/null path to /dev/null.0 and places a file at
that location?
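
A safer sketch is to log to a real file, here the path already present
in the quoted comment, and rely on "squid -k rotate" together with the
logfile_rotate setting to keep the old copies under control:

  cache_log /var/log/squid/cache.log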


> cache_store_log none

It is not necessary to set things to their default value in Squid-3.

> logfile_rotate 3
> pid_filename /var/run/squid.pid
> 
> # FILTERING HTTPS
> acl 1 dstdomain .fbcdn.net .akamaihd.net .fbsbx.com
> #acl 2a dstdomain .mahadana.com .mql4.com .metaquotes.net
> acl 2 url_regex -i ^https?:\/\/attachment\.fbsbx\.com\/.*\?(id=[0-9]*).*
> acl 2 url_regex -i
> \.fbsbx\.com\/.*\/(.*\.(unity3d|pak|zip|exe|dll|jpg|png|gif|swf)/)$


The above "2" lines are pointless. The ACL called "1" has already
matched and allowed Store-ID processing of all the domains this pattern
might match.

> acl 2 url_regex -i ^https?:\/\/.*\.ytimg\.com(.*\.(webp|jpg|gif))

The above pattern does not match what it may seem to match.

Notice that there is no path-segment delimiter ('/' or '\?') required.
So the thing that _looks_ like a file extension can match when existing
in the domain name (eg http://hello.ytimg.com.gif-fy.invalid/ will be
allowed).


> acl 2 url_regex -i ^https?:\/\/([^\.]*)\.yimg\.com\/(.*)
> acl 2 url_regex -i ^https?:\/\/.*\.gstatic\.com\/images\?q=tbn\:(.*)



It is pointless to place "(.*)" or ".*" or ".+" at the start or end of a
regex pattern.

An arbitrary suffix (or prefix, when the pattern is not anchored with
'^') is implicit, and all this will do is slow the regex processing down
even further by trying to match the entire (possibly VERY long) URL
against ".*".
That goes for all places you use regex patterns.
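
As a hedged sketch of the missing delimiter and the redundant trailing
wildcards, applied to three of the quoted patterns (not tested against
the store-id.pl helper, which may expect something different). The
hostname part uses a character class that cannot run past the first
'/', and the trailing "(.*)" groups are dropped:

  acl 2 url_regex -i ^https?://[^/]*\.ytimg\.com/.*\.(webp|jpg|gif)
  acl 2 url_regex -i ^https?://[^/.]*\.yimg\.com/
  acl 2 url_regex -i ^https?://[^/]*\.gstatic\.com/images\?q=tbn:

With the first line, http://hello.ytimg.com.gif-fy.invalid/ no longer
matches, because ".ytimg.com" must now be followed directly by "/".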


> acl 2 url_regex -i
> ^https?:\/\/.*\.reverbnation\.com\/.*\/(ec_stream_song|download_song_direct|stream_song)\/([0-9]*).*
> acl 2 url_regex -i
> ^https?:\/\/([a-z0-9.]*)(\.doubleclick\.net|\.quantserve\.com|.exoclick\.com|interclick.\com|\.googlesyndication\.com|\.auditude\.com|.visiblemeasures\.com|yieldmanager|cpxinteractive)(.*)
> acl 2 url_regex -i ^https?:\/\/(.*?)\/(ads)\?(.*?)
> acl 2 url_regex -i ^https?:\/\/.*steampowered\.com\/.*\/([0-9]+\/(.*))
> acl 3 url_regex -i
> ^https?:\/\/(.*?)\/speedtest\/.*\.(jpg|txt|png|gif|swf)\?.*
> acl 3 url_regex -i speedtest\/.*\.(jpg|txt|png|gif|swf)\?.*

Notice that the second line for "3" matches everything the first pattern
does, and a lot more. You can erase the first pattern and save a lot of
CPU without any change in proxy allow/deny permissions.
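
In config terms that could look like the following sketch, which keeps
only the broader pattern and also drops its redundant trailing ".*":

  acl 3 url_regex -i speedtest/.*\.(jpg|txt|png|gif|swf)\?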

> acl 4 url_regex -i reverbnation.*audio_player.*ec_stream_song.*$
> acl 5 url_regex -i utm.gif.*
> acl 6 url_regex -i c.android.clients.google.com.market.GetBinary.GetBinary.*
> acl 7 url_regex -i youtube.*(ptracking|stream_204|player_204|gen_204).*$
> acl 7 url_regex -i
> \.c\.(youtube|google)\.com\/(get_video|videoplayback|videoplay).*$
> acl 7 url_regex -i (youtube|google).*\/videoplayback\?.*
> acl 8 http_status 302
> acl getmethod method GET
> 
> ssl_bump splice localhost
> acl 9 at_step SslBump1
> acl 10 at_step SslBump2
> acl 11 at_step SslBump3
> ssl_bump peek 9 all
> ssl_bump bump 10 all
> ssl_bump bump 11 all

Why the weird numbers?

The "all" on each of the above ssl_bump lines is pointless and may be
confusing you.
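
A sketch of the same rules with descriptive ACL names (step1, step2 and
step3 are arbitrary names chosen for this example) and without the
redundant "all":

  acl step1 at_step SslBump1
  acl step2 at_step SslBump2
  acl step3 at_step SslBump3
  ssl_bump splice localhost
  ssl_bump peek step1
  ssl_bump bump step2
  ssl_bump bump step3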


> 
> sslcrtd_program /usr/lib/squid/ssl_crtd -s /var/lib/squid/ssl_db -M 4MB
> sslcrtd_children 16 startup=1 idle=1
> sslproxy_capath /etc/ssl/certs
> sslproxy_cert_error allow all

Above prevents Squid from acting on TLS errors. In fact it can cause
some to be hidden which should be fatal to the transaction.


> sslproxy_flags DONT_VERIFY_PEER		#this line fixing www.gmail.com,

Absolutely No! The above line 'fixes' nothing. All it does is tell Squid
not to bother checking TLS security.

Those problems still exist, still cause other side effects, and are
often breaking things for clients. But you cannot see that because it is
being hidden by the above setting.

 ... and as a bonus (for the bad guys) your proxy can now be hijacked by
a malicious HTTPS server without any hints being given about it happening.


I suspect that whatever is going wrong the crypto activity is part of
it. But without those crypto issues being visible nobody can say for
sure. There are also refresh_pattern issues mentioned below.
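
A sketch of the TLS settings with the two masking lines taken out, so
that certificate problems become visible instead of being silently
ignored. It assumes /etc/ssl/certs holds the CA certificates this
machine should trust:

  sslproxy_capath /etc/ssl/certs
  # sslproxy_cert_error allow all       <- removed
  # sslproxy_flags DONT_VERIFY_PEER     <- removed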


> mail.yahoo.com for some errors
> always_direct allow all

Please remove this. You do not have any cache_peer lines. This setting
was only ever needed for a single Squid-3.2 release over a 2-week
period. That bug was long, long, long ago fixed.


> ssl_unclean_shutdown on
> 
> # STORE ID
> store_id_program /usr/bin/perl /etc/squid/store-id.pl
> store_id_children 10 startup=5 idle=2 concurrency=10
> store_id_access allow 1
> store_id_access allow 2
> store_id_access allow 3
> store_id_access allow 4
> store_id_access allow 5
> store_id_access allow 6

So if the 1 thru 6 ACLs are all just url_regex patterns, why bother
having them as separate ACLs? You can list all the patterns in one ACL
and save a lot of CPU cycles and time.
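
A sketch of that consolidation for the url_regex ACLs 3 to 6. The name
storeid_urls is made up for this example, the patterns are copied from
the quoted config minus their trailing ".*", and the dstdomain ACL "1"
stays separate because it is a different ACL type:

  acl storeid_urls url_regex -i speedtest/.*\.(jpg|txt|png|gif|swf)\?
  acl storeid_urls url_regex -i reverbnation.*audio_player.*ec_stream_song
  acl storeid_urls url_regex -i utm.gif
  acl storeid_urls url_regex -i c.android.clients.google.com.market.GetBinary.GetBinary

  store_id_access allow 1
  store_id_access allow storeid_urls

The patterns from "2" could be folded into the same ACL in the same way.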


> store_id_access allow 7
> store_miss deny 7 8
> send_hit deny 7 8

It would be a lot more performant to switch those around. Check 8 then
7. Like so:

  store_miss deny 8 7
  send_hit deny 8 7


> store_id_access deny all
> 

This should really be up with the other store_id_access lines.
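
Put together, the ordering could look like this sketch, which uses the
ACL names from the quoted config unchanged:

  store_id_access allow 1
  store_id_access allow 2
  store_id_access allow 3
  store_id_access allow 4
  store_id_access allow 5
  store_id_access allow 6
  store_id_access allow 7
  store_id_access deny all

  store_miss deny 8 7
  send_hit deny 8 7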


> # TUNNING CACHE
> max_stale 1 years
> vary_ignore_expire on
> shutdown_lifetime 10 seconds
> 
> # REFRESH PATTERN
> refresh_pattern -i https?:\/\/.*\.xx\.fbcdn\.net\/.*\.(jpg|png) 43830 99%
> 259200 override-expire override-lastmod ignore-reload

Um,

1) override-lastmod is generally a bad idea. It prevents the
Last-Modified header from telling Squid how long ago an object was last
updated - this option actively *reduces* the time objects can be
cached.

2) override-expire should not be used for sites like Facebook which
provide well behaved cacheability headers. Like (1) it actively breaks
caching with often the opposite result to what one wants.

3) ignore-reload is part of why your Browser "refresh" attempts are
failing to do anything at all.

4) override-expire is *shortening* the caching time for these objects.
Facebook actually has pretty good cacheability once you get past the
problem of it all being hidden behind crypto.


> refresh_pattern static\.(xx|ak)\.fbcdn\.net*\.(jpg|gif|png) 241920 99%
> 241920 ignore-reload override-expire ignore-no-store

5) ignore-no-store is a bad idea. This *forces* private details from one
person's FB profile pages to be delivered to other clients. It exists
only because there are some very badly designed sites abusing the
Cache-Control header. Facebook is *not* one of those sites.


> refresh_pattern ^https?\:\/\/profile\.ak\.fbcdn.net*\.(jpg|gif|png) 241920
> 99% 241920 ignore-reload override-expire ignore-no-store
> refresh_pattern (akamaihd|fbcdn)\.net 14400 99% 518400  ignore-no-store
> ignore-private ignore-reload ignore-must-revalidate store-stale


6) ignore-private has been made relatively safe in the latest Squid. BUT
the revalidation mechanisms are required for it to be safe at all.
 It is a very bad idea to use with either ignore-reload or
ignore-must-revalidate ... let alone both at once. Security
vulnerabilities will exist as a result of these options used together.

7) store-stale is in a similar position of requiring revalidation /
reload to be possible. But with less severe results - only badly broken
web page display.


These issues caused by (5), (6), and (7) could be at least a part of
what is going wrong. Probably also some other things.
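
As a sketch, this is what the Facebook patterns look like once those
options are stripped. The numbers are kept from the quoted config and
are not a recommendation. The "/.*" in the second line is an assumed
fix for the quoted "net*\." sequence, in which the "*" applies only to
the letter "t", so no path can appear between the hostname and the file
extension:

  refresh_pattern -i https?://.*\.xx\.fbcdn\.net/.*\.(jpg|png) 43830 99% 259200
  refresh_pattern static\.(xx|ak)\.fbcdn\.net/.*\.(jpg|gif|png) 241920 99% 241920
  refresh_pattern (akamaihd|fbcdn)\.net 14400 99% 518400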


> refresh_pattern (audio|video)\/(webm|mp4) 129600 99% 129600 ignore-reload
> override-expire override-lastmod ignore-must-revalidate  ignore-private
> ignore-no-store ignore-auth store-stale


Notice that if the earlier FB patterns did not match, this forces video
and audio URLs to be immediately stale/expired, forces them to be
cached anyway, forces all clients/users to get the same objects, and
then also prohibits anything from updating the cache object if a broken
one gets into cache somehow.

You were saying something about video problems?


The remainder of your refresh_patterns show a lot of repeats of these
issues.

Remember: these options are dangerous. Use with great care. And
understand what the options are doing.



> 
> This way i can play the facebook videos, but no caching is done i only get
> TCP_TUNNEL/200
> 
> 
> 
> http_port 3129 tproxy ssl-bump generate-host-certificates=on
> dynamic_cert_mem_cache_size=4MB cert=/etc/squid/ssl_certs/squid.crt
> key=/etc/squid/ssl_certs/squid.key
> cipher=ECDHE-RSA-RC4-SHA:ECDHE-RSA-AES128-SHA:DHE-RSA-AES128-SHA:DHE-RSA-CAMELLIA128-SHA:AES128-SHA:RC4-SHA:HIGH:!aNULL:!MD5:!ADH
> 


Do you know what the differences between these two port lines are?

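In one the client is aware that the proxy exists and sends details to it
in a CONNECT request. In the other (the tproxy port) the traffic is
intercepted, so the client does not know the proxy exists and sends no
CONNECT request.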


> 
> And this way i can cache https but cannot play the videos on the facebook at
> all
> 
> 
> 
> http_port 3128 ssl-bump cert=/etc/squid/ssl_certs/myCA.pem
> generate-host-certificates=on dynamic_cert_mem_cache_size=4MB
> 
> 
> 
> 
> Any ideas or hints on what could be wrong? i am kind of lost now.. any tips
> will be appreciate
> 
> 
> 

