[squid-users] Squid and multipart form decode

Alex Rousskov rousskov at measurement-factory.com
Wed Jul 29 17:49:41 UTC 2020


On 7/29/20 12:47 PM, Ryan Le wrote:

>>> Even though it looks like TeChunkedParser is getting all the
>>> additional headers

>> TeChunkedParser has nothing to do with multipart/form-data bodies.

> I do see it in two locations.

> 2020/07/26 23:11:12.921 kid6| 74,9| TeChunkedParser.cc(45) parse: Parse
> buf={length=3667, data='e47
> -----------------------------351645264024548376901231954897
> Content-Disposition: form-data; name="action"
...

You see the parser reporting the raw input buffer that it is about to
parse. The parser will treat everything you see after the first "e47"
line (which specifies the chunk size in hex) as opaque body bytes (until
the start of the next chunk metadata).
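
To illustrate, here is a minimal Python sketch (not Squid code) of what
removing chunked transfer encoding amounts to. The function name
decode_chunked and the sample part are made up for illustration, and the
sketch ignores trailers and error handling. The point is that only the
chunk-size lines are interpreted; the multipart boundary and the
Content-Disposition line come out the other end as uninterpreted bytes.

```python
def decode_chunked(raw: bytes) -> bytes:
    """Strip chunked transfer encoding; return the body as opaque bytes."""
    body = bytearray()
    pos = 0
    while True:
        eol = raw.index(b"\r\n", pos)
        # The chunk-size line is hexadecimal (for example, "e47" is 3655).
        # Anything after ';' on that line is a chunk extension; skip it.
        size = int(raw[pos:eol].split(b";", 1)[0].strip(), 16)
        pos = eol + 2
        if size == 0:
            break  # last-chunk; trailers (if any) are ignored in this sketch
        body += raw[pos:pos + size]  # copied as-is, never inspected
        pos += size + 2  # skip the chunk data and the CRLF that follows it
    return bytes(body)


# Hypothetical single-chunk input resembling the buffer dump above.
part = (
    b"--boundary\r\n"
    b'Content-Disposition: form-data; name="action"\r\n'
    b"\r\n"
    b"frm_submit_dropzone\r\n"
)
raw = format(len(part), "x").encode("ascii") + b"\r\n" + part + b"\r\n0\r\n\r\n"
decoded = decode_chunked(raw)
```
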


> As well as the following location

> 2020/07/26 23:11:12.921 kid6| 58,9| HttpMsg.cc(198) parse:
> HttpMsg::parse success (689 bytes) near 'POST http://bbbb.com/post HTTP/1.1
> User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:78.0)
...

Same thing here: this is just a raw input buffer dump.


> What I am ultimately trying to accomplish is to see the best way to get
> more detail and have an action on sites that are posting using
> multipart/form-data as the Content-Type header.

ACL-driven actions based on the Content-Type header value should work
fine. Logging of the Content-Type header value to access.log should work
fine.  If something does not work, please provide a specific non-working
configuration example.

Some of your earlier messages sounded like you want Squid to act based
on MIME headers inside the message body or even the values of HTML form
entries. Squid cannot do that on its own. To analyze message bodies
(even in read-only mode), you will need a custom ICAP or eCAP service:
https://wiki.squid-cache.org/SquidFaq/ContentAdaptation
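
As a rough idea of the body analysis such a service would have to do
itself, here is a minimal Python sketch that extracts form field names
and values from a multipart/form-data body using the standard email
package. The function name form_fields, the "XyZ" boundary, and the
sample body are assumptions for illustration. This is not an ICAP or
eCAP API, and Squid performs none of these steps.

```python
from email.parser import BytesParser
from email.policy import default


def form_fields(content_type: str, body: bytes) -> dict:
    """Map form field names to their raw values (hypothetical helper)."""
    # Prepend the HTTP Content-Type header so that the MIME parser learns
    # the boundary; the body alone does not declare it.
    prefix = b"Content-Type: " + content_type.encode("ascii") + b"\r\n\r\n"
    msg = BytesParser(policy=default).parsebytes(prefix + body)
    fields = {}
    for part in msg.iter_parts():
        # The field name lives in each part's own Content-Disposition header.
        name = part.get_param("name", header="content-disposition")
        fields[name] = part.get_payload(decode=True)
    return fields


# Made-up body with two fields, already stripped of chunked encoding.
sample_body = (
    b"--XyZ\r\n"
    b'Content-Disposition: form-data; name="action"\r\n'
    b"\r\n"
    b"frm_submit_dropzone\r\n"
    b"--XyZ\r\n"
    b'Content-Disposition: form-data; name="field_id"\r\n'
    b"\r\n"
    b"8\r\n"
    b"--XyZ--\r\n"
)
fields = form_fields("multipart/form-data; boundary=XyZ", sample_body)
```

A real service would receive the header and body from Squid over ICAP or
eCAP and would have to cope with large uploads and malformed input.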


HTH,

Alex.



> On Wed, Jul 29, 2020 at 12:16 PM Alex Rousskov wrote:
> 
>     On 7/29/20 11:38 AM, Ryan Le wrote:
>     > Even though it looks like TeChunkedParser is getting all the
>     > additional headers
> 
>     TeChunkedParser has nothing to do with multipart/form-data bodies.
>     TeChunkedParser parses chunked encoding, and even then it is applied to
>     remove _transfer_ encoding, not to interpret the actual resource content
>     inside the chunks.
> 
>     I am not sure, but it looks like you have pasted a part of an ICAP
>     message. TeChunkedParser is used to parse chunked transfer encoding used
>     for a part of the ICAP message body. Beyond decoding those chunks, it is
>     all opaque data to Squid.
> 
>     To avoid misunderstanding, in your pasted example, the content of the
>     first chunk starts with these two lines:
> 
>     > -----------------------------328901485836611227811186534509
>     > Content-Disposition: form-data; name="action"
> 
>     It does _not_ start with the "Content-Disposition:..." line or the
>     "frm_submit_dropzone" line.
> 
> 
>     > I can't seem to create ACL or output them using
>     > logformat. I was trying to request these headers with
>     > req_mime_type/resp_mime_type.
> 
>     If by "them" you mean MIME headers inside multipart parts, then Squid
>     does not see them and does not operate on them. The inside of each
>     chunk is opaque data to Squid.
> 
> 
>     > and also had log_mime_hdrs on and then in
>     > logformat just had all.
> 
>     You should be able to log the HTTP request header values using %>h or
>     %>ha. You will not be able to log or match any message body snippets,
>     including things like MIME Content-Disposition values. Squid does not
>     look inside the body of the POSTed resource.
> 
> 
>     If you need further help, you may want to clarify what you are trying to
>     achieve. You said "send multipart form data to another service". Are you
>     trying to _route_ request messages based on multipart form _contents_?
> 
> 
>     HTH,
> 
>     Alex.
> 
> 
>     > On Thu, Jul 23, 2020 at 11:46 AM Ryan Le wrote:
>     >
>     >     Thanks, 
>     >
>     >     I have been looking at the squid debug and can see that it is
>     >     getting the multipart.
>     >
>     >     POST http://bbbbbb.com
>     >     User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:78.0)
>     >     Gecko/20100101 Firefox/78.0
>     >     Accept: application/json
>     >     Accept-Language: en-US,en;q=0.5
>     >     Accept-Encoding: gzip, deflate
>     >     Referer: http://bbbbb.com
>     >     Cache-Control: no-cache
>     >     X-Requested-With: XMLHttpRequest
>     >     Content-Type: multipart/form-data;
>     >     boundary=---------------------------328901485836611227811186534509
>     >     Content-Length: 1245
>     >     Origin: http://bbbbb.com
>     >     Cookie: cookie
>     >     Host: bbbbbbb.com
>     >     Via: ICAP/1.0 
>     >
>     >     4dd
>     >     -----------------------------328901485836611227811186534509
>     >     Content-Disposition: form-data; name="action"
>     >
>     >     frm_submit_dropzone
>     >     -----------------------------328901485836611227811186534509
>     >     Content-Disposition: form-data; name="field_id"
>     >
>     >     8
>     >     -----------------------------328901485836611227811186534509
>     >     Content-Disposition: form-data; name="form_id"
>     >
>     >     5
>     >     -----------------------------328901485836611227811186534509
>     >     Content-Disposition: form-data; name="nonce"
>     >
>     >     e1aca92777
>     >     -----------------------------328901485836611227811186534509
>     >     Content-Disposition: form-data; name="file8"; filename="translate.zip"
>     >     Content-Type: application/x-zip-compressed
>     >
>     >     On Thu, Jul 23, 2020 at 11:16 AM Alex Rousskov
>     >     <rousskov at measurement-factory.com> wrote:
>     >
>     >         On 7/23/20 9:22 AM, Ryan Le wrote:
>     >         > I have been trying to configure squid to decode and send
>     >         > multipart form data to another service. Is there an acl or
>     >         > build parameter needed for multipart form data support?
>     >
>     >         No, there is no need to allow any specific Content-Type,
>     >         including multipart. Squid does not know anything about
>     >         multipart/form-data. If a multipart/form-data message is
>     >         well-formed from HTTP point of view, then Squid will process
>     >         it as any other message, including passing it to ICAP/eCAP
>     >         (where configured).
>     >
>     >         Cheers,
>     >
>     >         Alex.
>     >
> 


