[squid-users] Proxy server to support a large number of simultaneous requests
Alex Rousskov
rousskov at measurement-factory.com
Wed May 31 13:43:01 UTC 2023
On 5/31/23 02:56, Andrey K wrote:
> > Do you get close to 100% hit ratio if clients access these URLs
> > sequentially rather than concurrently? If not, then focus on that
> > problem before you open the collapsed forwarding Pandora box.
> When I run curl sequentially like this:
> for i in `seq 500`; do curl --tlsv1.2 -k --proxy 0001vsg01:3131 -v
> $URL >/dev/null 2>&1; done
> I get only the first request with a status TCP_MISS and all others with
> TCP_MEM_HIT:
> Cnt Status Parent
> 499 TCP_MEM_HIT/200/- HIER_NONE/-
> 1 TCP_MISS/200/200 FIRSTUP_PARENT/parent_proxy
Excellent. This confirms that your Squid can successfully cache this
object (in memory).
> It is interesting to note that on both squid versions if I run a
> separate curl after processing 500 or 200 concurrent threads, I get a
> result with the status TCP_MISS/200
The next step I would recommend is to study the very first cache miss
_after_ the 500 or 200 concurrent threads test. Doing so may shed light
on why Squid is refusing to serve that (presumably cached) object from
the cache. I suspect that the object was marked for deletion earlier,
but we should check before spending more time on more complex triage of
concurrent cases. If you can share a (link to) compressed ALL,9
cache.log from that single transaction against Squid v6, I may be able
to help you with that step.
Cheers,
Alex.
> > What is your Squid version? Older Squids have more collapsed forwarding
> > bugs than newer ones. I recommend testing with Squid v6 or master/v7, at
> > least to confirm that the problem is still present in the latest
> > official code.
> I run tests on SQUID 5.9.
> We compiled 6.0.2 (with disabled delay-pools) and increased memory
> parameters:
> cache_mem 2048 MB
> maximum_object_size_in_memory 10 MB
> The complete configuration is shown below.
>
> Now on the version 6.0.2 we have the next results:
> 500 threads - Hit ratio 3.8%:
> 3 TCP_CF_HIT/200/- HIER_NONE/-
> 2 TCP_CF_MISS/200/200 FIRSTUP_PARENT/parent_proxy
> 16 TCP_HIT/200/- HIER_NONE/-
> 467 TCP_MISS/200/200 FIRSTUP_PARENT/parent_proxy
> 12 TCP_SWAPFAIL_MISS/200/200 FIRSTUP_PARENT/parent_proxy
>
> 200 threads - 6%
> 6 TCP_CF_HIT/200/- HIER_NONE/-
> 10 TCP_HIT/200/- HIER_NONE/-
> 176 TCP_MISS/200/200 FIRSTUP_PARENT/parent_proxy
> 8 TCP_SWAPFAIL_MISS/200/200 FIRSTUP_PARENT/parent_proxy
>
> 50 threads - 82%
> 30 TCP_CF_HIT/200/- HIER_NONE/-
> 11 TCP_HIT/200/- HIER_NONE/-
> 1 TCP_MISS/200/200 FIRSTUP_PARENT/parent_proxy
> 8 TCP_SWAPFAIL_MISS/200/200 FIRSTUP_PARENT/parent_proxy
>
> The results are slightly worse than they were on the version 5.9.
> It is interesting to note that on both squid versions if I run a
> separate curl after processing 500 or 200 concurrent threads, I get a
> result with the status TCP_MISS/200, although the requested URL is
> already in the rock cache (I can see it in the contents of the cache
> using the utility I developed rock_cache_dump.pl
> <http://rock_cache_dump.pl>:
> $VAR1 = {
> '1' => {
> 'VERSION' => 'Wed May 31 09:18:05 2023',
> 'KEY_MD5' => 'e5eb10f0ab7d84ff9d3fd1e5a6d3eb9c',
> 'OBJSIZE' => 446985,
> 'STD_LFS' => {
> 'lastref' => 'Wed May 31 09:18:05 2023',
> 'flags' => '0x4004',
> 'expires' => 'Wed May 31 15:18:05 2023',
> 'swap_file_sz' => 0,
> 'refcount' => 1,
> 'lastmod' => 'Wed Jun 29 16:09:14 2016',
> 'timestamp' => 'Wed May 31 09:18:05 2023'
> },
> 'URL' =>
> 'https://ia800406.us.archive.org/13/items/romeo-y-julieta-texto-completo/Romeo%20y%20Julieta%20-%20William%20Shakespeare.pdf <https://ia800406.us.archive.org/13/items/romeo-y-julieta-texto-completo/Romeo%20y%20Julieta%20-%20William%20Shakespeare.pdf>'
> }
> };
>
> ).
>
> > How much RAM does your server have? You are using default 256MB memory
> > cache (cache_mem). If you have spare memory, make your memory cache much
> > larger: A rock cache_dir cannot (yet) share the response _while_ the
> > response is being written to disk, so relying on cache_dir too much will
> > decrease your hit ratio, especially in a collapsed forwarding
> environment.
> The VM has 32 GB RAM. I configured cache_mem 2048 MB on the 6.0.2 version.
>
> > Is your Squid built with --enable-delay-pools? If yes, TCP_MISS does not
> > necessarily mean a cache miss (an old Squid bug), even if you do not use
> > any delay pools.
> Yes, delay pools on the version 5.9 were enabled though we don't use
> them. I disabled this feature on the 6.0.2 version.
>
>
> > Since you are trying to cache objects lager than 512KB, see
> > maximum_object_size_in_memory.
> I configured maximum_object_size_in_memory 10 MB on the 6.0.2 version
> (as videochunks are less than 7 MB).
>
> > Consider making your test much longer (more sequential requests per
> > client/curl worker), to see whether the cache becomes "stable" after one
> > of the first transactions manages to fully cache the response. This may
> > not help with older Squids, but might help with newer ones. However, you
> > should not test using real origin servers (that you do not control)!
> I don't have any of my own web servers for tests, so I choose some
> resources on the public internet that have a robust infrastructure.
> I will conduct the longer tests next week.
>
> Kind regards,
> Ankor.
>
> *squid.conf*
> workers 21
>
> sslcrtd_program /data/squid.user/usr/lib/squid/security_file_certgen -s
> /data/squid.user/var/lib/squid/ssl_db -M 20MB
> sslcrtd_children 21
>
> logformat extended-squid %{%Y-%m-%d %H:%M:%S}tl| %6tr %>a
> %Ss/%03>Hs/%<Hs %<st %rm %ru %un %Sh/%<A %mt %ea
>
> logfile_rotate 0
> access_log daemon:/var/log/squid.user/access.log
> logformat=extended-squid on-error=drop
>
> cache_peer parent_proxy parent 3128 0
> never_direct allow all
>
> cachemgr_passwd pass config
>
> acl PURGE method PURGE
> http_access allow PURGE
>
> http_access allow all
>
> http_port 3131 ssl-bump generate-host-certificates=on
> dynamic_cert_mem_cache_size=20MB
> tls-cert=/etc/squid.user/sslbump/bump.crt
> tls-key=/etc/squid.user/sslbump/bump.key
> sslproxy_cert_error allow all
>
> acl step1 at_step SslBump1
> acl step2 at_step SslBump2
> acl step3 at_step SslBump3
>
> ssl_bump peek step1
> ssl_bump bump step2
> ssl_bump bump step3
>
> cache_dir rock /data/squid.user/cache 20000 max-size=12000000
> cache_swap_low 85
> cache_swap_high 90
>
> collapsed_forwarding on
> cache_mem 2048 MB
> maximum_object_size_in_memory 10 MB
>
> pinger_enable off
> max_filedesc 8192
> shutdown_lifetime 5 seconds
> netdb_filename none
> log_icp_queries off
>
> via off
> forwarded_for delete
>
> client_request_buffer_max_size 100 MB
>
> coredump_dir /data/squid.user/var/cache/squid
>
>
>
>
> пн, 29 мая 2023 г. в 23:17, Alex Rousskov
> <rousskov at measurement-factory.com
> <mailto:rousskov at measurement-factory.com>>:
>
> On 5/29/23 10:43, Andrey K wrote:
>
> > We need to configure a dedicated proxy server to provide caching of
> > online video broadcasts in order to reduce the load on the uplink
> proxy.
> > Hundreds of users will access the same video-chunks simultaneously.
> >
> > I developed a simple configuration for the test purposes (it is
> shown
> > below).
> > The *collapsed_forwarding* option is on.
>
> Do you get close to 100% hit ratio if clients access these URLs
> sequentially rather than concurrently? If not, then focus on that
> problem before you open the collapsed forwarding Pandora box.
>
> What is your Squid version? Older Squids have more collapsed forwarding
> bugs than newer ones. I recommend testing with Squid v6 or
> master/v7, at
> least to confirm that the problem is still present in the latest
> official code.
>
> How much RAM does your server have? You are using default 256MB memory
> cache (cache_mem). If you have spare memory, make your memory cache
> much
> larger: A rock cache_dir cannot (yet) share the response _while_ the
> response is being written to disk, so relying on cache_dir too much
> will
> decrease your hit ratio, especially in a collapsed forwarding
> environment.
>
> Is your Squid built with --enable-delay-pools? If yes, TCP_MISS does
> not
> necessarily mean a cache miss (an old Squid bug), even if you do not
> use
> any delay pools.
>
> Since you are trying to cache objects lager than 512KB, see
> maximum_object_size_in_memory.
>
> Consider making your test much longer (more sequential requests per
> client/curl worker), to see whether the cache becomes "stable" after
> one
> of the first transactions manages to fully cache the response. This may
> not help with older Squids, but might help with newer ones. However,
> you
> should not test using real origin servers (that you do not control)!
>
>
> > Could you clarify if this behavior of my squid is
> > a bug/misconfiguration, or if I'm running into server performance
> > limitations (squid is running on a VM with 22 cores)?
>
> Most likely, reduction of hit ratio with increase of concurrency is
> _not_ a performance limitation.
>
>
> HTH,
>
> Alex.
>
>
> > I selected a couple of cacheable resources in the internet for
> testing:
> > - small size (~400 KB):
> >
> https://ia800406.us.archive.org/13/items/romeo-y-julieta-texto-completo/Romeo%20y%20Julieta%20-%20William%20Shakespeare.pdf <https://ia800406.us.archive.org/13/items/romeo-y-julieta-texto-completo/Romeo%20y%20Julieta%20-%20William%20Shakespeare.pdf> <https://ia800406.us.archive.org/13/items/romeo-y-julieta-texto-completo/Romeo%20y%20Julieta%20-%20William%20Shakespeare.pdf <https://ia800406.us.archive.org/13/items/romeo-y-julieta-texto-completo/Romeo%20y%20Julieta%20-%20William%20Shakespeare.pdf>>
> > - large (~8 MB):
> >
> https://ia600601.us.archive.org/10/items/Linux-Journal-2015-01/Linux-Journal-2015-01.pdf <https://ia600601.us.archive.org/10/items/Linux-Journal-2015-01/Linux-Journal-2015-01.pdf> <https://ia600601.us.archive.org/10/items/Linux-Journal-2015-01/Linux-Journal-2015-01.pdf <https://ia600601.us.archive.org/10/items/Linux-Journal-2015-01/Linux-Journal-2015-01.pdf>>
> > To test simultaneous connections I am forking curl using a simple
> script
> > (it is also shown below).
> >
> > When I run a test (500 curl threads to
> >
> https://ia800406.us.archive.org/13/items/romeo-y-julieta-texto-completo/Romeo%20y%20Julieta%20-%20William%20Shakespeare.pdf <https://ia800406.us.archive.org/13/items/romeo-y-julieta-texto-completo/Romeo%20y%20Julieta%20-%20William%20Shakespeare.pdf> <https://ia800406.us.archive.org/13/items/romeo-y-julieta-texto-completo/Romeo%20y%20Julieta%20-%20William%20Shakespeare.pdf <https://ia800406.us.archive.org/13/items/romeo-y-julieta-texto-completo/Romeo%20y%20Julieta%20-%20William%20Shakespeare.pdf>>) I see lots of TCP_MISS/200 with FIRSTUP_PARENT/parent_proxy records in the logs.
> >
> > A simple analysis shows a low percentage of cache hits:
> > cat /var/log/squid.user/access.log| grep '2023-05-29 14' | grep
> pdf |
> > awk '{print $5" " $10}' | sort | uniq -c
> > 24 TCP_CF_MISS/200/- HIER_NONE/-
> > 457 TCP_MISS/200/200 FIRSTUP_PARENT/parent_proxy
> > 10 TCP_MISS/200/- HIER_NONE/-
> > 9 TCP_SWAPFAIL_MISS/200/200 FIRSTUP_PARENT/parent_proxy
> >
> > So the Hit ratio is about (500-457-9)*100/500=6.8%
> >
> > Almost the same situation we see when run 200 threads:
> > cat /var/log/squid.user/access.log| grep '2023-05-29 15:45' |
> grep pdf
> > | awk '{print $5" " $10}' | sort | uniq -c
> > 4 TCP_CF_MISS/200/- HIER_NONE/-
> > 140 TCP_MISS/200/200 FIRSTUP_PARENT/parent_proxy
> > 40 TCP_MISS/200/- HIER_NONE/-
> > 16 TCP_SWAPFAIL_MISS/200/200 FIRSTUP_PARENT/parent_proxy
> >
> > This time the Hit ratio is about (200-140-16)*100/500=21%
> >
> > With 50 threads the Hit ratio is 90%:
> > cat /var/log/squid.user/access.log| grep '2023-05-29 15:50' |
> grep pdf
> > | awk '{print $5" " $10}' | sort | uniq -c
> > 27 TCP_CF_MISS/200/- HIER_NONE/-
> > 1 TCP_MISS/200/200 FIRSTUP_PARENT/parent_proxy
> > 18 TCP_MISS/200/- HIER_NONE/-
> > 4 TCP_SWAPFAIL_MISS/200/200 FIRSTUP_PARENT/parent_proxy
> >
> > I thought that it should always be near 99% - only the first
> request to
> > an URL should be forwarded to the parent proxy and all subsequent
> > requests should be served from the cache.
> >
> > The situation is even worse with downloading a large file:
> > 500 threads (0.4%):
> > cat /var/log/squid.user/access.log| grep '2023-05-29 17:2' | grep
> pdf |
> > awk '{print $5" " $10}' | sort | uniq -c
> > 10 TCP_CF_MISS/200/200 FIRSTUP_PARENT/parent_proxy
> > 2 TCP_CF_MISS/200/- HIER_NONE/-
> > 488 TCP_MISS/200/200 FIRSTUP_PARENT/parent_proxy
> >
> > 200 threads (3%):
> > cat /var/log/squid.user/access.log| grep '2023-05-29 17:3' | grep
> pdf |
> > awk '{print $5" " $10}' | sort | uniq -c
> > 9 TCP_CF_MISS/200/200 FIRSTUP_PARENT/parent_proxy
> > 6 TCP_CF_MISS/200/- HIER_NONE/-
> > 180 TCP_MISS/200/200 FIRSTUP_PARENT/parent_proxy
> > 5 TCP_SWAPFAIL_MISS/200/200 FIRSTUP_PARENT/parent_proxy
> >
> > 50 threads (98%):
> > cat /var/log/squid.user/access.log| grep '2023-05-29 17:36' |
> grep pdf
> > | awk '{print $5" " $10}' | sort | uniq -c
> > 25 TCP_CF_HIT/200/- HIER_NONE/-
> > 12 TCP_CF_MISS/200/- HIER_NONE/-
> > 12 TCP_HIT/200/- HIER_NONE/-
> > 1 TCP_MISS/200/200 FIRSTUP_PARENT/parent_proxy
> >
> > Could you clarify if this behavior of my squid is a
> > bug/misconfiguration, or if I'm running into server performance
> > limitations (squid is running on a VM with 22 cores)?
> >
> > Kind regards,
> > Ankor
> >
> >
> >
> > *squid.conf:*
> > workers 21
> >
> > sslcrtd_program
> /data/squid.user/usr/lib/squid/security_file_certgen -s
> > /data/squid.user/var/lib/squid/ssl_db -M 20MB
> > sslcrtd_children 21
> >
> > logformat extended-squid %{%Y-%m-%d %H:%M:%S}tl| %6tr %>a
> > %Ss/%03>Hs/%<Hs %<st %rm %ru %un %Sh/%<A %mt %ea
> >
> > logfile_rotate 0
> > access_log daemon:/var/log/squid.user/access.log
> > logformat=extended-squid on-error=drop
> >
> > cache_peer parent_proxy parent 3128 0
> > never_direct allow all
> >
> > cachemgr_passwd pass config
> >
> > acl PURGE method PURGE
> > http_access allow PURGE
> >
> > http_access allow all
> >
> > http_port 3131 ssl-bump generate-host-certificates=on
> > dynamic_cert_mem_cache_size=20MB
> > tls-cert=/etc/squid.user/sslbump/bump.crt
> > tls-key=/etc/squid.user/sslbump/bump.key
> > sslproxy_cert_error allow all
> >
> > acl step1 at_step SslBump1
> > acl step2 at_step SslBump2
> > acl step3 at_step SslBump3
> >
> > ssl_bump peek step1
> > ssl_bump bump step2
> > ssl_bump bump step3
> >
> > cache_dir rock /data/squid.user/cache 20000 max-size=12000000
> > cache_swap_low 85
> > cache_swap_high 90
> >
> > *collapsed_forwarding on*
> >
> > pinger_enable off
> > max_filedesc 8192
> > shutdown_lifetime 5 seconds
> > netdb_filename none
> > log_icp_queries off
> > client_request_buffer_max_size 100 MB
> >
> > via off
> > forwarded_for delete
> >
> > coredump_dir /data/squid.user/var/cache/squid
> >
> > *curl_forker.sh:*
> > #!/bin/sh
> > N=100
> >
> URL=https://ia600601.us.archive.org/10/items/Linux-Journal-2015-01/Linux-Journal-2015-01.pdf <https://ia600601.us.archive.org/10/items/Linux-Journal-2015-01/Linux-Journal-2015-01.pdf> <https://ia600601.us.archive.org/10/items/Linux-Journal-2015-01/Linux-Journal-2015-01.pdf <https://ia600601.us.archive.org/10/items/Linux-Journal-2015-01/Linux-Journal-2015-01.pdf>>
> >
> > if [[ -n $1 && $1 =~ help$ ]];
> > then
> > echo "Usage: $0 [<cnt>] [<url>]"
> > echo
> > echo "Example: $0 10
> >
> https://ia600601.us.archive.org/10/items/Linux-Journal-2015-01/Linux-Journal-2015-01.pdf <https://ia600601.us.archive.org/10/items/Linux-Journal-2015-01/Linux-Journal-2015-01.pdf> <https://ia600601.us.archive.org/10/items/Linux-Journal-2015-01/Linux-Journal-2015-01.pdf <https://ia600601.us.archive.org/10/items/Linux-Journal-2015-01/Linux-Journal-2015-01.pdf>>";
> > echo
> > exit;
> > fi
> >
> > while [[ $# -gt 0 ]]
> > do
> > if [[ $1 =~ ^[0-9]+$ ]]
> > then
> > N=$1
> > else
> > URL=$1
> > fi
> > shift
> > done
> >
> > echo $URL
> > echo $N threads
> >
> > for i in `seq $N`
> > do
> > nohup curl --tlsv1.2 -k --proxy 0001vsg01:3131 -v $URL
> >/dev/null
> > 2>&1 &
> > done
> >
> >
> >
> >
> >
> >
> >
> >
> >
> >
> >
> > _______________________________________________
> > squid-users mailing list
> > squid-users at lists.squid-cache.org
> <mailto:squid-users at lists.squid-cache.org>
> > http://lists.squid-cache.org/listinfo/squid-users
> <http://lists.squid-cache.org/listinfo/squid-users>
>
> _______________________________________________
> squid-users mailing list
> squid-users at lists.squid-cache.org
> <mailto:squid-users at lists.squid-cache.org>
> http://lists.squid-cache.org/listinfo/squid-users
> <http://lists.squid-cache.org/listinfo/squid-users>
>
More information about the squid-users
mailing list