[squid-dev] [PATCH] improve CharacterSet
Kinkie
gkinkie at gmail.com
Sun Dec 27 09:37:04 UTC 2015
On Sun, Dec 27, 2015 at 2:50 AM, Alex Rousskov
<rousskov at measurement-factory.com> wrote:
> On 12/26/2015 09:03 AM, Kinkie wrote:
>> Hi all,
>> the attached patch improves CharacterSet by adding a subtraction
>> operator, equality test and associated unit tests.
>
Hi Alex, thanks for taking the time.
> Perhaps this is different in C++11, but IIRC we are supposed to:
>
> 1. declare non-modifying operators such as "-" outside the class
That's typical but not mandatory. Changing all to reflect
> 2. declare an inequality operator "!=" when declaring an equality operator
If unequality is not tested.. Still, adding it is trivial, so I'm doing that.
> For #1, you can claim code consistency as an excuse because the "+"
> operator was already misplaced, but since no existing or proposed code
> uses the new operators (AFAIK), it seems like it would be a good idea to
> fix the wrong location before (or while) adding the new [unused] ones.
Done.
>
>
>> + if (this == &c) // identity test
>> + return true;
>
> I would remove this check as unneeded unless you are sure that the
> comparison with self is common enough to optimize for. If you keep this,
> please s/identity test/optimization: fast comparison with self/ because
> I do not think the term "identity test" is standard [enough] to be
> recognized by many readers and googling it does not help.
>
>
>> + auto e=chars_.cend();
>> + for (auto i = chars_.cbegin(), j = c.chars_.cbegin(); i != e; ++i, ++j) {
>> + if (*i != *j)
>> + return false;
>> + }
>
> Is the existing Storage::operator "==" so bad that we need to hand-roll
> a loop [with a critical hidden assumption] like that? If hand-rolling is
> needed, please at least add a comment that the code assumes that all
> chars_ containers have the same length.
That's ignorance on my side; I now checked that
std::vector::operator== compares elements; replacing whole method with
call to that.
The assumption is not hidden, it's an invariant enforced by all constructors.
>> + /// remove all characters from the given CharacterSet to this one
>> + CharacterSet &operator -=(const CharacterSet &src);
>
> Please remove the "to this one" copy-paste typo.
>
> I recommend rephrasing the description to document what happens when the
> character cannot be removed because it is not in the "this" set. For
> example:
>
> /// set subtraction: remove all characters that are also in src
>
>
> "src" is a weird name for the parameter subtraction operator argument.
> You can cite code consistency as an excuse but consider using "other" or
> perhaps something like "banned" instead.
rhs? that's fairly common for operators
>> /// return a new CharacterSet containing the union of two sets, labeled as the first argument
>> CharacterSet operator +(const CharacterSet &src) const;
> ...
>> + /// return a new CharacterSet containing the set of characters in the first but not in the second
>> + CharacterSet operator -(const CharacterSet &src) const;
>
>
> It is funny that you preserved many bad things but decided to drop
> documentation about the label. Please add it.
>
> Please note that neither the old operator "+" nor the new operator "-"
> have two [explicit] arguments [until they are moved outside the class as
> discussed in #1 above] so the comment using words "first" and "second"
> is rather confusing!
Moved outside class; renamed parameters to lhs and rhs, reworked
doxygen comment.
>> +static
>> +std::ostream& operator<< (std::ostream &s, const CharacterSet &c)
>> +{
>> + s << "CharacterSet(" << c.name << ')';
>> + return s;
>> +}
>> +
>
> Perhaps this should be provided in the CharacterSet header, for all to
> use? You would have to split the definition from the declaration to
> avoid dragging iostream into the header, but this operator feels useful
> to me, even in its current "primitive" form.
I've thought about it, and decided not to, as it'd pull in iostream to
all users of CharacterSet.
This is mostly meant as a convenience to allow using
CPPUNIT_ASSERT_EQUAL which is more telling that CPPUNIT_ASSERT but
requires being able to print arguments.
>
>
>> - for (int j = 0; j < 255; ++j)
>> - if (j != '0')
>> + for (int j = 0; j < 255; ++j) {
>> + if (j != '0') {
>> CPPUNIT_ASSERT_EQUAL(false,t[j]);
>> + } else {
>> + CPPUNIT_ASSERT_EQUAL(true,t[j]);
>> + }
>> + }
>
> If polishing this code is in scope, consider fixing the loop boundary as
> well: There is nothing so special about character with codepoint 255 to
> exclude it AFAICT.
Right.
>> +void
>> +testCharacterSet::CharacterSetSubtract()
>> +{
>> + CharacterSet sample(NULL, "0123456789aAbBcCdDeEfFz");
>> + sample -= CharacterSet(NULL, "z");
>> + CPPUNIT_ASSERT_EQUAL(CharacterSet::HEXDIG, sample);
>> +}
>
>
> It would be good to have an A-B test where B has [some] characters not
> present in A.
>
> Since operator "-" operator is implemented using operator "-=" it may be
> better to test the former (if we have to pick between testing one or the
> other).
Ok.
Updated patch attached; thanks again!
--
Francesco
-------------- next part --------------
A non-text attachment was scrubbed...
Name: characterset-improvements-v2.patch
Type: text/x-patch
Size: 16539 bytes
Desc: not available
URL: <http://lists.squid-cache.org/pipermail/squid-dev/attachments/20151227/439d5326/attachment-0001.bin>
More information about the squid-dev
mailing list