[squid-dev] [PATCH] improve CharacterSet

Alex Rousskov rousskov at measurement-factory.com
Sun Dec 27 01:50:12 UTC 2015


On 12/26/2015 09:03 AM, Kinkie wrote:
> Hi all,
>   the attached patch improves CharacterSet by adding a subtraction
> operator, equality test and associated unit tests.


Perhaps this is different in C++11, but IIRC we are supposed to:

1. declare non-modifying operators such as "-" outside the class

2. declare an inequality operator "!=" when declaring an equality operator

For #1, you can claim code consistency as an excuse because the "+"
operator was already misplaced, but since no existing or proposed code
uses the new operators (AFAIK), it seems like it would be a good idea to
fix the wrong location before (or while) adding the new [unused] ones.


> +    if (this == &c) // identity test
> +        return true;

I would remove this check as unneeded unless you are sure that the
comparison with self is common enough to optimize for. If you keep this,
please s/identity test/optimization: fast comparison with self/ because
I do not think the term "identity test" is standard [enough] to be
recognized by many readers and googling it does not help.


> +    auto e=chars_.cend();
> +    for (auto i = chars_.cbegin(), j = c.chars_.cbegin(); i != e; ++i, ++j) {
> +        if (*i != *j)
> +            return false;
> +    }

Is the existing Storage::operator "==" so bad that we need to hand-roll
a loop [with a critical hidden assumption] like that? If hand-rolling is
needed, please at least add a comment that the code assumes that all
chars_ containers have the same length.


> +    /// remove all characters from the given CharacterSet to this one
> +    CharacterSet &operator -=(const CharacterSet &src);

Please remove the "to this one" copy-paste typo.

I recommend rephrasing the description to document what happens when the
character cannot be removed because it is not in the "this" set. For
example:

/// set subtraction: remove all characters that are also in src


"src" is a weird name for the parameter subtraction operator argument.
You can cite code consistency as an excuse but consider using "other" or
perhaps something like "banned" instead.


>      /// return a new CharacterSet containing the union of two sets, labeled as the first argument
>      CharacterSet operator +(const CharacterSet &src) const;
...
> +    /// return a new CharacterSet containing the set of characters in the first but not in the second
> +    CharacterSet operator -(const CharacterSet &src) const;


It is funny that you preserved many bad things but decided to drop
documentation about the label. Please add it.

Please note that neither the old operator "+" nor the new operator "-"
have two [explicit] arguments [until they are moved outside the class as
discussed in #1 above] so the comment using words "first" and "second"
is rather confusing!


> +static
> +std::ostream& operator<< (std::ostream &s, const CharacterSet &c)
> +{
> +    s << "CharacterSet(" << c.name << ')';
> +    return s;
> +}
> +

Perhaps this should be provided in the CharacterSet header, for all to
use? You would have to split the definition from the declaration to
avoid dragging iostream into the header, but this operator feels useful
to me, even in its current "primitive" form.


> -        for (int j = 0; j < 255; ++j)
> -            if (j != '0')
> +        for (int j = 0; j < 255; ++j) {
> +            if (j != '0') {
>                  CPPUNIT_ASSERT_EQUAL(false,t[j]);
> +            } else {
> +                CPPUNIT_ASSERT_EQUAL(true,t[j]);
> +            }
> +        }

If polishing this code is in scope, consider fixing the loop boundary as
well: There is nothing so special about character with codepoint 255 to
exclude it AFAICT.


> +void
> +testCharacterSet::CharacterSetSubtract()
> +{
> +    CharacterSet sample(NULL, "0123456789aAbBcCdDeEfFz");
> +    sample -= CharacterSet(NULL, "z");
> +    CPPUNIT_ASSERT_EQUAL(CharacterSet::HEXDIG, sample);
> +}


It would be good to have an A-B test where B has [some] characters not
present in A.

Since operator "-" operator is implemented using operator "-=" it may be
better to test the former (if we have to pick between testing one or the
other).


HTH,

Alex.



More information about the squid-dev mailing list