[squid-dev] [PATCH] Better support for unknown URL schemes

Alex Rousskov rousskov at measurement-factory.com
Tue Mar 15 16:52:12 UTC 2016


On 03/15/2016 09:36 AM, Amos Jeffries wrote:
> Squid already contains AnyP::PROTO_UNKNOWN support for unknown protocols
> but currently does not preserve the actual string value received for them.
> 
> This adds a textual representation ('image') to the UriScheme object to
> fill that gap and ensure that all URL representatinos (ie cache keys,
> logs and outgoing messages) are generated with the scheme string as it
> was received.

Preserving textual representation is the right thing to do, but the
implementation itself needs a lot more work IMO.


>      /// convert the URL scheme to that given
>      void setScheme(const AnyP::ProtocolType &p) {scheme_=p; touch();}
> +    void setSchemeImage(const char *str) {scheme_.setImage(str); touch();}


The parameter type should be changed instead. Yes, that will require
more changes, but those changes are essential to properly support
foreign protocol names. Without those changes, you are creating an API
bug in addition to adding a useful feature.

We (mostly you) have already done similar work for foreign HTTP request
methods. This is similar -- UriScheme should be used nearly everywhere
(instead of ProtocolType that does not carry enough information). It
could be argued that the UriScheme class itself should be renamed to
something like Protocol, but I do not want to argue about that.


> +    /// Sets the string representation of this scheme.
> +    /// Only needed if the scheme type is PROTO_UNKNOWN.
> +    void setImage(const char *str) {assert(theScheme_ == AnyP::PROTO_UNKNOWN); schemeImage_ = str;}

This method should be a constructor.


>  char const *
>  AnyP::UriScheme::c_str() const
>  {

This method should be replaced with AnyP::UriScheme::image() returning SBuf.


> +    if (!schemeImage_.isEmpty())
> +        return schemeImage_.c_str();
> +

Why are we not lowering the case of a foreign protocol image if we are
lowering the case of standard protocol images below? Either both should
be lowered or there should be a source code comment explaining the
discrepancy.

If both should be lowered, the lowering should be done in the
constructor to avoid lowering the same string multiple times.


>      if (theScheme_ > AnyP::PROTO_NONE && theScheme_ < AnyP::PROTO_MAX) {
> -        const char *in = AnyP::ProtocolType_str[theScheme_];
> -        for (; p < (BUFSIZ-1) && in[p] != '\0'; ++p)
> -            out[p] = xtolower(in[p]);
> +        schemeImage_ = AnyP::ProtocolType_str[theScheme_];
> +        schemeImage_.toLower();
>      }

This slow code may not be needed at all if the rest of the changes are
implemented. If this code stays, please add an "XXX: Slow." or a similar
comment after toLower(). There are many ways to fix/optimize this, but
the best way probably depends on the rest of the changes so I am not
going to suggest any specific TODO here.


> @@ -157,6 +158,9 @@
>      if (strncasecmp(b, "whois", len) == 0)
>          return AnyP::PROTO_WHOIS;
>  
> +    if (len > 0)
> +        return AnyP::PROTO_UNKNOWN;
> +
>      return AnyP::PROTO_NONE;
>  }

This function should return a UriScheme object instead. Ideally, using
created-once constants for common protocols (so that we do not have to
deal with to-lower-case conversion every time a scheme image is needed).


> @@ -27,7 +28,7 @@
>      ~UriScheme() {}
...
> +    /// the string representation to use for theScheme_
> +    mutable SBuf schemeImage_;

Please do not name Foo members FooX. One Foo is enough!


> +    /// the string representation to use for theScheme_

s/to use for theScheme_//
or
s/to use for theScheme_/of the scheme//

If you can make such a guarantee after all other changes, please add
something like "; always in lower case".


> +    mutable SBuf schemeImage_;

Mutable may not be needed after all other changes.


It is difficult for me to review 4-line no-function-names diff, so
forgive me if I missed any context-dependent caveats.


Thank you,

Alex.



More information about the squid-dev mailing list