[squid-users] gzip deflate

Amos Jeffries squid3 at treenet.co.nz
Thu Mar 17 14:13:48 UTC 2016


On 17/03/2016 10:56 p.m., joe wrote:
> Alex Rousskov wrote
>> On 03/16/2016 02:21 AM, joe wrote:
>>> You need to direct messages to the service(s) using adaptation_access
>>> directives:
>>> Isn't it faster if we use the gzip library instead? That would minimize the
>>> redirect ms... direct decompress.
>>
>> Virtually everything would be faster if done directly in Squid. The
>> processing speed is not the primary acceptance criterion, or we would
>> never have ICAP, eCAP, URL rewriters, authentication helpers, SSL
>> certificate validators, and other "helpers".
>>
>> Compressing and uncompressing content on-the-fly is a lot more difficult
>> than adding a couple of zlib function calls. It probably also requires
>> several configuration/tuning options to accommodate various deployment
>> environments. To make it worth implementing inside Squid, decompression
>> has to be a very frequent feature request OR it has to be nearly
>> impossible to do well outside of Squid (while still being reasonably
>> popular). AFAICT, neither is true but please submit a more detailed
>> proposal if I am wrong.
>>
>>
>> Thank you,
>>
>> Alex.
>>
> 
> It would be a start for future technologies (squid-gzip-deflate):
> Content-Encoding could be hooked into the code to decomp... instead of using eCAP.
> 

Squid is a proxy. Proxies are supposed to participate in
Transfer-Encoding, not Content-Encoding. Only user agents and origin
servers are supposed to participate in Content-Encoding.

The future technology we are trying to aim Squid at is HTTP/2. In h2
Transfer-Encoding takes the form of DATA frames which are compressed
individually as a unit independent of other DATA frames in the same
stream. This is *much* easier to code than HTTP/1.x stream compression.
But we need to get h2 implemented first.


> 
> Yes, there is a lot of work to get it fully working, but it's a start to the
> development... to continue working on it.
> There is a lot of free snippet code around, and it is not hard to make it fit
> into recent Squid code.
> 

The first problem with all the code below is that it uses std::string.
Squid does not use std::string for I/O buffered content.

Translating the content from StoreIOBuffer type into std::string, then
compressing into another std::string, then copying back into
StoreIOBuffer for delivery is ~60% slower than normal Squid delivery.


The second problem is that it assumes the data is all known before
compress/decompress starts.

HTTP traffic is not all arriving in one block. Squid may receive the
content asynchronously in chunks as small as 1 single byte per I/O
cycle. Any codec needs to be able to cope with that variable and
streamed flow of data without resorting to errors.



> Just some sample code to adapt; it can be the start of the work.
> #include <string>
> #include <sstream>
> #include <stdexcept>
> #include <string.h>
> #include "zlib.h"
> 
> using std::string;
> using std::stringstream;
> 
> // Found these here:
> // http://mail-archives.apache.org/mod_mbox/trafficserver-dev/201110.mbox/%3CCACJPjhYf=+br1W39vyazP=ix
> //eQZ-4Gh9-U6TtiEdReG3S4ZZng at mail.gmail.com%3E
> #define MOD_GZIP_ZLIB_WINDOWSIZE 15
> /** Decompress an STL string using zlib and return the original data. */
> std::string decompress_deflate(const std::string& str)
> {
>     z_stream zst;                        // z_stream is zlib's control structure
>     memset(&zst, 0, sizeof(zst));
> 
>     if (inflateInit(&zst) != Z_OK) {
>         debugs(98, 1, "inflateInit failed while decompressing.");
>         return std::string();            // no valid stream, nothing to decode
>     }
>     zst.next_in = (Bytef*)str.data();
>     zst.avail_in = str.size();
>     int ret;
>     char outbuffer[32768];
>     std::string outstring;
> 
>     // get the decompressed bytes blockwise using repeated calls to inflate
>     do {
>         zst.next_out = reinterpret_cast<Bytef*>(outbuffer);
>         zst.avail_out = sizeof(outbuffer);
>         ret = inflate(&zst, Z_NO_FLUSH);
>         if (outstring.size() < zst.total_out) {
>             outstring.append(outbuffer,
>                              zst.total_out - outstring.size());
>         }
> 
>     } while (ret == Z_OK);
> 
>     inflateEnd(&zst);
> 
>     if (ret != Z_STREAM_END) {          // an error occurred that was not EOF
>         debugs(98, 2, "Exception during deflate decompression: (" << ret <<
>                ") " << (zst.msg ? zst.msg : "no message"));
>     }
> 
>     return outstring;
> }
> 
> std::string decompress_gzip(const std::string& str)
> {
>     z_stream zst;                        // z_stream is zlib's control structure
>     memset(&zst, 0, sizeof(zst));
>     // windowBits + 16 tells zlib to expect a gzip header and trailer
>     if (inflateInit2(&zst, MOD_GZIP_ZLIB_WINDOWSIZE + 16) != Z_OK) {
>         debugs(98, 1, "inflateInit2 failed while decompressing.");
>         return std::string();            // no valid stream, nothing to decode
>     }
>     zst.next_in = (Bytef*)str.data();
>     zst.avail_in = str.size();
>     int ret;
>     char outbuffer[32768];
>     std::string outstring;
> 
>     // get the decompressed bytes blockwise using repeated calls to inflate
>     do {
>         zst.next_out = reinterpret_cast<Bytef*>(outbuffer);
>         zst.avail_out = sizeof(outbuffer);
>         ret = inflate(&zst, Z_NO_FLUSH);
> 
>         if (outstring.size() < zst.total_out) {
>             outstring.append(outbuffer, zst.total_out - outstring.size());
>         }
> 
>     } while (ret == Z_OK);
> 
>     inflateEnd(&zst);
> 
>     if (ret != Z_STREAM_END) {          // an error occurred that was not EOF
>         debugs(98, 2, "Exception during zlib decompression: (" << ret << ") "
>                << (zst.msg ? zst.msg : "no message"));
>     }
> 
>     return outstring;
> }
> 
> =========
> I know Squid's code is hard to work with, but it's a start. Nothing easy, but
> it would be a step toward the future, better than waiting for apps on the web to
> become fully compressed, and most would likely prefer that to eCAP.
> 

We are not saying it's hard to use the library. We are saying that to get
this project happening you first have to design how something like the
above is going to fit into Squid.

Doing the planning/design work will show you how the gzip code will
actually have to be written.


I have done a lot of the design and preparation work for adding TE:gzip.
But it's not quite finished yet. The plan was to hook a TeGzipDecoder
object into the TeChunkedParser, so that each block of data to be
de-chunked was then decompressed, with the result being added to the
output I/O buffer.
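
The shape of that hook can be sketched roughly as follows. This is only an
illustration under assumptions, not the actual Squid classes:
SketchBodyDecoder, SketchChunkedParser and PassThroughDecoder are
hypothetical names, std::string stands in for the output I/O buffer, and
the pass-through stage marks the place where a gzip decoder built on zlib's
inflate() would sit. The parser is incremental and hands every de-chunked
piece to the decoder stage as soon as it is available.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical decoder stage: consumes de-chunked bytes, appends decoded bytes.
struct SketchBodyDecoder {
    virtual ~SketchBodyDecoder() = default;
    virtual void decode(const char *data, std::size_t len, std::string &out) = 0;
};

// Stand-in stage that copies bytes unchanged; a gzip stage would inflate here.
struct PassThroughDecoder : SketchBodyDecoder {
    void decode(const char *data, std::size_t len, std::string &out) override {
        out.append(data, len);
    }
};

// Hypothetical incremental parser for HTTP/1.1 chunked coding.
// Chunk extensions and trailer fields are skipped, not interpreted.
class SketchChunkedParser {
public:
    explicit SketchChunkedParser(SketchBodyDecoder &next) : next_(next) {}

    // Feed any amount of input; returns true once the last chunk and the
    // trailer section have been consumed.
    bool feed(const char *data, std::size_t len, std::string &out) {
        assert(data != nullptr || len == 0);
        std::size_t i = 0;
        while (i < len && state_ != Done) {
            const char c = data[i];
            switch (state_) {
            case Size: // collect the hex chunk-size line
                ++i;
                if (c == '\n') {
                    remaining_ = std::stoul(line_, nullptr, 16); // stops at ';'
                    line_.clear();
                    state_ = remaining_ ? Data : Trailer;
                } else if (c != '\r') {
                    line_ += c;
                }
                break;
            case Data: { // pass chunk payload straight to the decoder stage
                const std::size_t n = std::min(remaining_, len - i);
                next_.decode(data + i, n, out);
                i += n;
                remaining_ -= n;
                if (remaining_ == 0)
                    state_ = DataEnd;
                break;
            }
            case DataEnd: // CRLF that terminates the chunk payload
                ++i;
                if (c == '\n')
                    state_ = Size;
                else if (c != '\r')
                    throw std::runtime_error("missing CRLF after chunk data");
                break;
            case Trailer: // skip trailer lines up to the empty line
                ++i;
                if (c == '\n') {
                    if (line_.empty())
                        state_ = Done;
                    line_.clear();
                } else if (c != '\r') {
                    line_ += c;
                }
                break;
            case Done:
                break;
            }
        }
        return state_ == Done;
    }

private:
    enum State { Size, Data, DataEnd, Trailer, Done };
    SketchBodyDecoder &next_;
    State state_ = Size;
    std::string line_;
    std::size_t remaining_ = 0;
};
```

Because the decoder stage only ever sees payload bytes, it needs no
knowledge of chunk framing, and the parser needs no knowledge of gzip.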

Amos


