[squid-dev] [PATCH] TidyPointer removal

Fri Jul 8 06:44:37 UTC 2016

On 8/07/2016 11:24 a.m., Alex Rousskov wrote:
> On 07/07/2016 01:40 PM, Amos Jeffries wrote:
>> On 4/07/2016 5:39 a.m., Alex Rousskov wrote:
>>>> +    /// Reset raw pointer - delete last one and save new one.
>>>> +    void reset(T *t) {
>>>> +        deletePointer();
>>>> +        raw = t;
>>>>      }
> 
> 
>>> I do not think we can have a reset() method like this because the
>>> standard shared_ptr::reset() has a very different semantics. Let's call
>>> this absorb(). This is like a raw pointer assignment operator, but we
>>> want to keep it "explicit" because we think this is a dangerous operation.
> 
> 
>> What different semantics do you see in there?
> 
> Good question! This issue is confusing for me, so I apologize if this
> explanation is messy:
> 
> Facts:
> 
> * resetWithoutLocking(x) does not increment the reference counter for x.
> * resetAndLock(x) increments the reference counter for x.
> 
> What does the standard shared_ptr::reset(x) method do?
> 
> * If shared_ptr counts all references to x, internal and external, then
> shared_ptr::reset(x) is like our resetAndLock(x) -- it increments the
> counter.
> 
> * If standard shared_ptr only counts external references, then
> shared_ptr::reset(x) is like our resetWithoutLocking(x) -- it does not
> increment the counter.
> 
> This logic is totally flawed, on several levels, but this trap might
> help illustrating why I find this issue confusing.
> 
> To fix our logic, we should not misrepresent what shared_ptr does
> internally, but look at its API instead. The standard API guarantees
> that the following sequence leads to object X destruction:
> 
>   {
>       std::shared_ptr s;
>       X *x = createX();
>       s.reset(x);
>       // s is destroyed upon exiting its scope
>   }
> 
> Which of our two LockingPointer::reset*() methods provides the same
> guarantee?
> 
> The correct answer is: None! In our case, the reset*() caller has to
> know createX() details to call the right reset*() method, even though
> createX() returns a "raw" pointer that knows nothing about our
> LockingPointer. Similarly, the caller must know createX() details to
> decide whether it is safe to call the LockingPointer constructor!
> 
> This mess happens because, in some cases, OpenSSL effectively returns an
> already managed pointer, even though it does not know about our
> LockingPointer. Or, in other words, it happens because we [have to]
> share the pointer management mechanism with OpenSSL, but that mechanism
> does not come in a form of a smart pointer -- it is hidden inside the X
> object itself.
> 
> With a standard API, there is only one reset(x). In our case, we have to
> have two. Thus, none of them can be called reset()! We have to force the
> caller to make a choice, depending on the context (i.e., the source of x).
> 
> BTW, your decision to name both methods reset*() is superior to my
> earlier suggestion to name them absorb() and reset(). Let's leave your
> reset*() names intact.
> 

:-)

> We may also want to hide the constructor behind two static methods for
> similar reasons -- whether X is properly destructed in the following
> example depends on createX() details, and it really should not:
> 
>   {
>       LockingPointer lp(createX());
>   }
> 

In the context of how we agreed to shrink the library specific #if
wrapped parts down to specific action nuggets as much as possible. eg
how the session resume functons are defined:

I have been thinking the abstract API should use Pointer instead of Ptr
so the minimal #if part can do that library-specific initialization of
the Pointer and avoid constructing these things in generic code.
 If that works widely enough we can possibly get rid of the raw-ptr
constructor entirely for non-OpenSSL.

NP: trunk rev.14734 is a step along that path which demonstrates how it
could make things a fair bit simpler. With a caveat that I'm not sure
its going to be universally applicable, or just a best-effort idea.

Though slightly counter to that I'm also considering using that
constructor as the preferred way to initialize OpenSSL LockingPointer
objects so we can use the assignment operators (both copy and move) in
preference over the custom reset*() methods as a way to ease the
shared_ptr conversion as much as possible. You can see at least two
cases of that with X = std::move(Y) in these patches.

May be wishful thinking, but if all that works out we can have custom
LockingPointer exclusively for OpenSSL, and shared_ptr for GnuTLS (and
non-TLS builds). With the library agnostic code using only the
overlapping API between the two types. Which should be mostly; get(),
bool operators, and assignments.

> 
>> +    void resetWithoutLocking(T *t) {
>> +        unlock();
>> +        raw = t;
>>      }
>>  
>>      void resetAndLock(T *t) {
>> +        if (t != get()) {
>> +            resetWithoutLocking(t);
>> +            lock(t);
>> +        }
>> +    }
> 
> We should probably add self-reset protection to both cases or to none.
> 

Strange as it seems doing that would violate the non-locking guarantee
on the first one.

Both reset*() methods (and assignments) guarantee that the UnLocker is
called on any previously stored value. If that means the raw-pointer is
about to go invalid then its a caller bug for using that method of the
API. Because resetWithoutLocking() fully delegates lock control to the
caller this is not something we can make assumptions about in the
resetWithoutLocking() variant.

Doing a self-check to retain raw-pointer validity would behave the same
as if it had locked the pointer given. So even if we did the check we
would still have to leave the pointer as invalid at the end of it all.
Nasty situation, but thats part of what makes it a dangerous method in
the first place.

> 
>> +    void unlock() {
>> +        if (raw)
>> +            UnLocker(raw);
>> +        raw = nullptr;
>> +    }
> 
> Let's move assignment inside the if-statement to save a few CPU cycles.
> 

Okay.

> 
>> +    T *raw; ///< pointer to T object or nullptr
> 
> That description adds virtually no information. How about something like
> this: "Normally, no other code will have this raw pointer. However,
> OpenSSL may keep raw pointers and manage their reference counts. When
> OpenSSL shares a pointer with us, the caller must not use LockingPointer
> constructor and must call resetAndLock()."
> 

Gone with a slightly different tack there. Not so adverse to
constructor, but emphasising the issue and that care needs to be taken.

"
Normally, no other code will have this raw pointer.

However, OpenSSL does some strange and not always consistent things.
OpenSSL library may keep its own internal raw pointers and manage their
reference counts independently, or it may not. This varies between API
functions, though it is usually documented.

This means the caller code needs to be carefuly written to use the
correct reset method and avoid the raw-pointer constructor unless
OpenSSL function producing the pointer is clearly documented as
incrementing a lock for it.
"

> 
>> +    explicit LockingPointer(T *t = nullptr): raw(t) {}
> 
> I recommend calling resetWithoutLocking() here even though it could
> waste a few CPU cycles. This is the only dangerous part remaining in the
> LockingPointer interface, and it may help being explicit here.
> 

Okay. And added a comment about why is de-optimized.

> 
>>  typedef void* SessionPtr;
>> +CtoCpp1(xfree, SessionPtr);
> 
> xfree() is already a C++ function and does not need wrapping, right?
> 

Hmm. Yes, I'll give it a try.

> 
>>> Please also add a public clear(void) method that reset(t) will call
>>> instead of lock() when t is nil. Alternatively, add a reset(void) method
>>> with the same semantics as clear(void) -- that is what std::shared_ptr does.
>>
>> I do not see any clear() method in shared_ptr documentation.
>> <http://en.cppreference.com/mwiki/index.php?title=Special%3ASearch&search=clear%28%29>
> 
> Yes, I know. As I said, they provide a reset(void) instead. If you think
> we should add reset(void) instead of clear(), let's add that, but please
> do add one instead of [ab]using reset*(nullptr).
> 

Okay, going with reset(void) then to ease the shared_ptr transition as
much as possible.

> 
>> + * The lock() method increments Object's reference counter.
>> + *
>> + * The unlock() method decrements Object's reference counter and destroys
>> + * the object when the counter reaches zero.
> 
> Please remove or move/merge these descriptions with their corresponding
> private methods descriptions. No need to place this private info in the
> class description. Sorry I did not mention that earlier.
> 

Done. Though I've changed the second one to be:
"
Become a nil pointer. Decrements any pointed-to Object's reference
counter using UnLocker which ideally destroys the object when the
counter reaches zero.
"

Since UnLocker is externally provided there is no guarantee that the
destruction takes place. Decrement and Destruct is only an ideal design
for UnLocker implementation. eg. the non-TLS builds UnLocker will do
nothing and T will just be a void or int placeholder.

> 
>>> Please note that the originally proposed commit message no longer
>>> applies to your changes so we need a new one.
>>>
>>
>> "
>> This replaces TidyPointer with std::unique_ptr and re-implements the
>> part of TidyPointer which LockingPointer was using. Removing the
>> inheritence in the process and updating methods names for clarity.
> 
> No serious objections, but I suggest:
> 
> -----------
> Replaced TidyPointer with std::unique_ptr.
> 
> LockingPointer is now a stand-alone class that is understood (and
> documented) as a typical shared pointer with OpenSSL-friendly object
> importing methods. Replaced its TidyPointer::reset() abuse with an
> explicit resetWithoutLocking() method after reviewing that all such
> calls needed no locking indeed.
> -----------
> 

Sure.

> 
>> The LockingPointer still has some of the issues which led us down the
>> path to getting here. I've chosen to submit for review at this point to
>> ensure we still have a fully working Squid before going further down
>> this trail.
> 
> I do not recall any remaining in-scope LockingPointer issues except that
> the constructor does not force the caller to pick the right importing
> route. What issues did I forget about?
> 

The absence of locks for GnuTLS and the need to use LockingPointer
without actual locking is still a bit painful.

The later work to actually use std::shared_ptr instead of LockingPointer
should get rid of that though.

> 
> All of the above changes do not require another review round IMO.
> 

Okay. Build testing the final version underway and will merged to trunk
when thats done.

Amos