[squid-users] Dynamic/CDN Content Caching Challenges

Muhammad Faisal faisalusuf at yahoo.com
Thu Apr 14 08:03:58 UTC 2016


Hi,
I'm trying to get Squid 3.5 to cache dynamic content (I have also tried 
several other versions, e.g. 2.7, 3.1 and 3.4). By "dynamic" I mean that 
the URL for the same content changes on every request, which wastes 
cache storage and gives a low hit rate. As I understand it, I have two 
challenges at the moment:

1- Websites that serve the requested content from dynamic URLs (e.g. 
filehippo, download.com, etc.)
2- Streaming websites where the dynamic URL is answered with 206 
(Partial Content), e.g. tune.pk videos, or Windows updates. Setting 
range_offset_limit to -1 causes havoc on our upstream link, so we have 
kept it disabled. Is there some way to control this behaviour?
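
For reference, range_offset_limit in Squid 3.5 accepts an optional ACL 
list, so the full-object fetch can in principle be limited to selected 
destinations instead of being enabled globally. The fragment below is 
only a sketch: the ACL name, the domain list and the size limit are 
assumptions, not tested values.

```
# Illustrative ACL; the domain list is an assumption.
acl rangefetch dstdomain .windowsupdate.com .tune.pk
# Fetch the whole object only for the listed domains; other range
# requests keep the default behaviour.
range_offset_limit -1 rangefetch
# Keep fetching after a client aborts so the object can be stored.
# This setting is global and also costs upstream bandwidth.
quick_abort_min -1 KB
# Objects larger than this are not cached; the value is an assumption.
maximum_object_size 200 MB
```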

If someone has successfully configured Squid for the above scenarios, 
please help me out, as I don't have the programming background to deal 
with this complexity.

I have tried several Store-ID helpers, but there is no saving on the 
upstream link; the content is still fetched from the origin. The helper 
I used is included below, after my setup summary.

My Setup Summary:
CentOS 6.5
TPROXY
Single Ethernet
Squid v3.5.16 (yum installed from repo)

Squid Cache: Version 3.5.16
Service Name: squid
configure options:  '--build=x86_64-redhat-linux-gnu' 
'--host=x86_64-redhat-linux-gnu' '--target=x86_64-redhat-linux-gnu' 
'--program-prefix=' '--prefix=/usr' '--exec-prefix=/usr' 
'--bindir=/usr/bin' '--sbindir=/usr/sbin' '--sysconfdir=/etc' 
'--datadir=/usr/share' '--includedir=/usr/include' '--libdir=/usr/lib64' 
'--libexecdir=/usr/libexec' '--sharedstatedir=/var/lib' 
'--mandir=/usr/share/man' '--infodir=/usr/share/info' '--verbose' 
'--exec_prefix=/usr' '--libexecdir=/usr/lib64/squid' 
'--localstatedir=/var' '--datadir=/usr/share/squid' 
'--sysconfdir=/etc/squid' '--with-logdir=$(localstatedir)/log/squid' 
'--with-pidfile=$(localstatedir)/run/squid.pid' 
'--disable-dependency-tracking' '--enable-follow-x-forwarded-for' 
'--enable-auth' 
'--enable-auth-basic=DB,LDAP,NCSA,NIS,PAM,POP3,RADIUS,SASL,SMB,getpwnam' 
'--enable-auth-ntlm=smb_lm,fake' '--enable-auth-digest=file,LDAP' 
'--enable-auth-negotiate=kerberos,wrapper' 
'--enable-external-acl-helpers=wbinfo_group,kerberos_ldap_group' 
'--enable-cache-digests' '--enable-cachemgr-hostname=localhost' 
'--enable-delay-pools' '--enable-epoll' '--enable-icap-client' 
'--enable-ident-lookups' '--enable-linux-netfilter' 
'--enable-removal-policies=heap,lru' '--enable-snmp' 
'--enable-storeio=aufs,diskd,ufs,rock' '--enable-wccpv2' '--enable-esi' 
'--enable-ssl-crtd' '--enable-icmp' '--with-aio' 
'--with-default-user=squid' '--with-filedescriptors=16384' '--with-dl' 
'--with-openssl' '--with-pthreads' '--with-included-ltdl' 
'--disable-arch-native' '--without-nettle' 
'build_alias=x86_64-redhat-linux-gnu' 
'host_alias=x86_64-redhat-linux-gnu' 
'target_alias=x86_64-redhat-linux-gnu' 'CFLAGS=-O2 -g -pipe -Wall 
-Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector 
--param=ssp-buffer-size=4 -m64 -mtune=generic' 'CXXFLAGS=-O2 -g -pipe 
-Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector 
--param=ssp-buffer-size=4 -m64 -mtune=generic -fPIC' 
'PKG_CONFIG_PATH=:/usr/lib64/pkgconfig:/usr/share/pkgconfig' 
--enable-ltdl-convenience
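
For context, a Store-ID helper such as the one below is normally wired 
into squid.conf roughly as follows. This is a sketch under assumptions: 
the script path, the ACL name, the domain list and the child counts are 
hypothetical and would need adjusting.

```
# Path is hypothetical; point it at wherever the helper script is saved.
store_id_program /usr/local/bin/storeid_helper.pl
# The helper reads the URL from the first field, so concurrency stays at 0
# (with concurrency enabled, Squid prefixes each line with a channel ID).
store_id_children 10 startup=2 idle=1 concurrency=0
# Only send URLs for selected domains to the helper (list is illustrative).
acl storeid_doms dstdomain .sourceforge.net .filehippo.com .avast.com
store_id_access allow storeid_doms
store_id_access deny all
```

Store-ID only changes the key under which an object is stored. The 
response still has to be cacheable (response headers, refresh_pattern) 
before any hit can occur.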

-----------------------------------------------
Store ID Helper
===============================================
#!/usr/bin/perl
# Improved\Converted StoreID helper based on perl and not on ruby
# this helper doesn't have my logic about youtube and other StoreID
# but it still does the trick in most cases and for my understanding.
# if you do want to understand the script logic rather than just
# use it try to look at: http://www1.ngtech.co.il/paste/1016/
# Eliezer Croitoru eliezer<at>ngtech.co.il

$|=1;
while (<>) {
         chomp;
         @X = split;
         if ($X[0] =~ m/^(exit|quit|x|q)/) {
                 print STDERR "quitting helper quietly\n";
                 exit 0;
         }


if ($X[0] =~ 
m/^http\:\/\/.*(youtube|google).*(videoplayback|liveplay).*/){
         @itag = m/[&?](itag=[0-9]*)/;
         @id = m/[&?](id=[^\&]*)/;
         @range = m/[&?](range=[^\&\s]*)/;
         @begin = m/[&?](begin=[^\&\s]*)/;
         @redirect = m/[&?](redirect_counter=[^\&]*)/;
         
$out="http://video-srv.youtube.com.squid.internal/@id&@itag&@range@begin@redirect";

} elsif ($X[0] =~ 
m/^http\:\/\/.*(profile|photo|creative).*\.ak\.fbcdn\.net\/((h|)(profile|photos)-ak-)(snc|ash|prn)[0-9]?(.*)/) 
{
         $out="http://fbcdn.net.squid.internal/" . $2  . "fb" .  $6  ;

} elsif ($X[0] =~ m/^http:\/\/i[1-4]\.ytimg\.com\/(.*)/) {
         $out="http://ytimg.com.squid.internal/" . $1 ;

} elsif ($X[0] =~ m/^http:\/\/.*\.dl\.sourceforge\.net\/(.*)/) {
           $out="http://dl.sourceforge.net.squid.internal/" . $1 ;

                 #Speedtest
} elsif ($X[0] =~ m/^http\:\/\/.*\/speedtest\/(.*\.(jpg|txt)).*/) {
         $out="http://speedtest.squid.internal/" . $1 ;

                 #BLOGSPOT
} elsif ($X[0] =~ m/^http:\/\/[1-4]\.bp\.(blogspot\.com.*)/) {
         $out="http://blog-cdn." . $1  ;

                 #AVAST
} elsif ($X[0] =~ m/^http:\/\/download[0-9]{3}.(avast.com.*)/) {
           $out="http://avast-cdn." . $1  ;

               #AVAST
} elsif ($X[0] =~ m/^http:\/\/[0-9]*\.[0-9]*\.[0-9]*\.[0-9]*\/(iavs.*)/) 
{
         $out="http://avast-cdn.avast.com/" . $1  ;

         #KAV
} elsif ($X[0] =~ m/^http:\/\/dnl-[0-9]{2}.(geo.kaspersky.com.*)/) {
           $out="http://kav-cdn." . $1  ;

                 #AVG
} elsif ($X[0] =~ m/^http:\/\/update\.(avg\.com.*)/) {
           $out="http://avg-cdn." . $1  ;

                 #maps.google.com
} elsif ($X[0] =~ 
m/^http:\/\/(cbk|mt|khm|mlt|tbn)[0-9]?(.google\.co(m|\.uk|\.id).*)/) {
         $out="http://" . $1  . $2 ;

                 #gstatic and/or wikimapia
} elsif ($X[0] =~ 
m/^http:\/\/([a-z])[0-9]?(\.gstatic\.com.*|\.wikimapia\.org.*)/) {
         $out="http://" . $1  . $2 ;

                 #maps.google.com
} elsif ($X[0] =~ m/^http:\/\/(khm|mt)[0-9]?(.google.com.*)/) {
         $out="http://" . $1  . $2 ;

                 #Google
} elsif ($X[0] =~ 
m/^http:\/\/www\.google-analytics\.com\/__utm\.gif\?.*/) {
         $out="http://www.google-analytics.com/__utm.gif";

} elsif ($X[0] =~ m/^http:\/\/(www\.ziddu\.com.*\.[^\/]{3,4})\/(.*?)/) {
         $out="http://" . $1 ;

                 #cdn, varialble 1st path
} elsif (($X[0] =~ /filehippo/) && 
(m/^http:\/\/(.*?)\.(.*?)\/(.*?)\/(.*)\.([a-z0-9]{3,4})(\?.*)?/)) {
         @y = ($1,$2,$4,$5);
         $y[0] =~ s/[a-z0-9]{2,5}/cdn./;
         $out="http://" . $y[0] . $y[1] . "/" . $y[2] . "." . $y[3] ;

                 #rapidshare
} elsif (($X[0] =~ /rapidshare/) && 
(m/^http:\/\/(([A-Za-z]+[0-9-.]+)*?)([a-z]*\.[^\/]{3}\/[a-z]*\/[0-9]*)\/(.*?)\/([^\/\?\&]{4,})$/)) 
{
         $out="http://cdn." . $3 . "/squid.internal/" . $5 ;

                 #for yimg.com video
} elsif ($X[0] =~ 
m/^http:\/\/(.*yimg.com)\/\/(.*)\/([^\/\?\&]*\/[^\/\?\&]*\.[^\/\?\&]{3,4})(\?.*)?$/) 
{
         $out="http://cdn.yimg.com/" . $3 ;

                 #for yimg.com doubled
} elsif ($X[0] =~ 
m/^http:\/\/(.*?)\.yimg\.com\/(.*?)\.yimg\.com\/(.*?)\?(.*)/) {
         $out="http://cdn.yimg.com/"  . $3 ;

                 #for Filehippo files
} elsif ($X[0] =~ 
m/^https?:\/\/.*\.(filehippo\.com)\/.*\/(.*\.(?:exe|zip|cab|msi|mru|mri|bz2|gzip|tgz|rar|pdf))/) 
{
                 $out="http://filehippo.sqinternal/" . $1 . $2 ;


                 #for yimg.com with &sig=
} elsif ($X[0] =~ m/^http:\/\/([^\.]*)\.yimg\.com\/(.*)/) {
         @y = ($1,$2);
         $y[0] =~ s/[a-z]+([0-9]+)?/cdn/;
         $y[1] =~ s/&sig=.*//;
         $out="http://" . $y[0] . ".yimg.com/"  . $y[1] ;

} else {
         $out="ERR";

}
if ( $out =~ m/^http\:\/\/.*/) {
  print "OK store-id=$out\n" ;
} else {
  print "ERR\n" ;
}
}
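
A single rule can also be checked outside Squid. The sketch below 
reproduces only the SourceForge rule from the helper above and feeds it 
a few sample request lines. The function names store_id_for and 
reply_for, the mirror host names and the extra fields after each URL 
are hypothetical and chosen for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical function: map a SourceForge mirror URL to a stable
# store-ID, or return undef when the URL does not match the rule.
sub store_id_for {
    my ($url) = @_;
    if ($url =~ m{^http://[^/]+\.dl\.sourceforge\.net/(.*)}) {
        return "http://dl.sourceforge.net.squid.internal/$1";
    }
    return;
}

# Build one reply line in the same format the helper above prints.
sub reply_for {
    my ($line) = @_;
    my ($url) = split ' ', $line;    # the URL is the first field
    my $id = defined $url ? store_id_for($url) : undef;
    return defined $id ? "OK store-id=$id" : "ERR";
}

# Sample request lines; everything after the URL is illustrative.
for my $line (
    "http://heanet.dl.sourceforge.net/project/foo/foo-1.0.zip 192.0.2.1/- - GET",
    "http://kent.dl.sourceforge.net/project/foo/foo-1.0.zip 192.0.2.2/- - GET",
    "http://example.com/index.html 192.0.2.3/- - GET",
) {
    print reply_for($line), "\n";
}
```

If the rule works as intended, the two mirror URLs produce the same 
store-id line and the unrelated URL produces ERR.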
--
Regards,
Faisal.