[squid-dev] Care and feeding of ConnStateData

Thu Jul 7 22:16:01 UTC 2016

On 8/07/2016 7:22 a.m., Alex Rousskov wrote:
> On 07/06/2016 10:52 PM, Amos Jeffries wrote:
>> On 7/07/2016 10:24 a.m., Alex Rousskov wrote:
>>> Q1. What is ConnStateData (and related client_side.* code)?
>>>
>>>   C1. ConnStateData is the code shared among all Servers (BB1).
>>>   C2. ConnStateData ends where request routing code (BB2) begins.
> 
> 
>> Last time we went over this issue AFAIK we agreed that the above was to
>> be the defintion of ::Server.
> 
> Yes, it is a definition of the future Server class.
> 
> IIRC, last time we talked about it, the C2 rule specifically and BB2
> role in general have not been discussed. Now C2 became important for the
> right Downloader placement (and it has other nice side effects that I am
> not discussing here to stay focused).
> 
> 
>> Pieces of ConnStateData which are meeting that description should be
>> suffled up into Server.
> 
> I strongly disagree! Isolating and moving pieces of C1/C2 compliant code
> into a parent class not only results in a lot more work but also forces
> us to create temporary pure virtual functions to tie the two classes
> together and it tempts folks to inherit from the wrong class. It creates
> complex relationships that are not needed today and will then have to be
> dismantled later.
> 
> The correct approach is to move pieces that do _not_ satisfy the above
> rules out of ConnStateData and then eventually rename ConnStateData to
> Server. This approach creates no extra work and actually reduces
> complexity with virtually every step.
> 
> More on that further below where Q4 is discussed.
> 
> 
>>> Q2. Where does the pending Downloader class belong?
> 
>> In overview I think if we have a good Downloader design those other
>> things and ESIInclude should probably be converted into Downloader or
>> children of it.
> 
> Yes, but if and only if ESI needs to download small SBuf-storable
> things. If ESI needs potentially large HTTP response bodies, then ESI
> cannot be refactored to use Downloader.
> 
> 
> 
>>> Q3. Where does the future Http::Two::Server class belong?
>>>
>>> Today, that common parent for the two siblings is ConnStateData.
>>> Tomorrow, we may split and/or rename ConnStateData, of course, but
>>> Http::One::Server and Http::Two::Server classes are likely to remain
>>> siblings with a common base that contains the code they share.
> 
>> My draft branch uses ::Server as parent to Http2::Server. I've not yet
>> run across anything in ConnStateData that needed it to be a child of
>> that. Though there are things that need moving up to the Server parent
>> class.
> 
> The last two sentences seem to contradict each other, but this
> particular difference in opinion does not seem to be very important at
> the moment: If your Http2::Server works as a Server child, it can work
> as a ConnStateData child with very little extra effort. I do object to
> moving stuff from ConnStateData to Server, but that is a separate
> problem discussed below.
> 
> 
> 
>>> Q4. What to do with the existing src/servers/Server.h?
> 
>>> Today, that class is a nuisance:
> 
> 
>>> * It is not a common base for all servers (its stated purpose).
> 
>> Which servers is it not a base for exactly?
> 
> All of them! Http::One::Server and Ftp::Server base is ConnStateData,
> not the current class in src/servers/Server.h. And, if my answer to Q3
> is correct, Http::Two::Server base will be ConnStateData and not the
> current class in src/servers/Server.h either.
> 
> 
>> Note that as parent of ConnStateData for now anything which implements
>> ConnStateData implements Server.
> 
> This is not quite accurate, actually (children can hide parent
> interfaces), but it is irrelevant for the above bullet. ConnStateData
> kids do not know about and do not benefit from src/servers/Server.h
> existence or the ConnStateData split. The class in src/servers/Server.h
> is not their common base. ConnStateData is.
> 
> 
>>> * It does not fully encapsulate connection-managing code of its kids
>>> (its stated usage).
> 
>> ... those changes to make it so got blocked in audit.
> 
> Whether some future changes will or will not address this part of the
> problem is pretty much irrelevant to the problem existence (it only
> confirms that existence if you will).

Its pretty significant that you blocked a class being implemented to
match its design and now use the fact that its not happeed as a reason
against the class existing at all.

> 
>>> * It is essentially unused (it is not used as an API by others and has
>>> only one kid).
>>>
>>
>> ... the other child class go blocked in audit initially and still
>> waiting the major redesign to be completed before it can be re-suvmitted.
> 
> Similar to the above, I do not think it is a good idea to keep that
> currently broken class just because it might be needed in the future.
> Moreover, I would not be surprised if its wrongful existence already
> exerts negative influence on that future code!

Our disgareement is about which class of the two is "broken".

IMO a class lacking some of its needed API (::Server) is far less broken
than a class with a lot of differently scoped bits tacked on all over
the place (ConnStateData).

An incomplete class can easily be completed without braking the things
depending on the existing parts. Not so the 'kitcen sink' one.

> 
>> The two grandchildren in trunk we planned to make direct children of it
>> (yes we, all three of us at the time agreed). The intermediate kid being
>> the ConnStateData mess still *temporarily* in the middle of the hierarchy.
> 
> 
> *If* I agreed on moving Server code out of ConnStateData instead of
> moving non-Server code out of ConnStateData, then I made a major mistake!
> 
> What anybody did or did not say years ago is secondary to the current
> discussion,

It is the status-quo design basis on which the current discussion is
taking place. So yeah, it is relevant.

> but I know that in 2014 I said pretty much the same thing I
> am saying now[4]: "Keep ConnStateData as is for now, until somebody
> volunteers to extract HTTP code from it and place the extracted HTTP
> code into servers/HttpServer, moving the remaining generic code into
> src/servers/Server and src/http/".
> 

Yes, exactly. As the person who for all those years has been attemping
to *do* that work...

> In other words:
> 
>   1. Extract HTTP server-specific code from ConnStateData.
> 
>   ...once #1 is done, ConnStateData will only have generic code left...
> 

Assuming the extraction happens and audit does not block it. Which was
my point above.

I *have* been proposing exactly those changes and getting blocks thrown
up constantly preventing stuff being moved out of ConnStateData.

Usually on grounds that as a parial-measure its not good enough, like
ConnStateData as a whole (huge and messy as it is) must be retained and
perfected in a single step before anything is acceptible.

>   2. Move that remaining generic code into src/servers/Server and src/http/.
> 
> We all have been doing step #1 since then, but we are not yet ready for
> step #2 because ConnStateData still has HTTP server-specific code in it.

Apart from auditing all I'm seeing from Factory is more and more
non-Server things being added to ConnStateData. First it was 1xx message
handling, then ssl-bump, then all the FTP translation logics, then
Downloader (thanks for reversing on that one).

If not for those specific projects ConnStateData would be around half
its current size with a much simpler Server <->BB2 split.

> 
> When we are ready for step #2, the "move" will be exactly that -- we
> will be moving files and renaming classes to match their new location.
> There is no time in this plan where there are two Server classes, one
> called "Server" and one called "ConnStateData"!

I dont think you quite appreciate the current _location_ of the
ConnStateData logic.

The .h and .cc files are shared with ClientSocketContext and
ClientHttpRequest. Which are more or less all of BB2 between them.

Since a fairly big portion of the ConnStateData logic is also BB2 scope
stuff done in the wrong class it just gets a rename in-place during
stage #1.

So in order to get any given file ready for a rename/move. We either

a) shuffle _all_ the BB2 scoped stuff out of its currently locations
(regardless of whether its in ConnStateData or not). OR,

b) shuffle the (relatively smaller) pieces of ConnStateData out to
Server.* during step #1 and the rename/move is to change the file into
whatever the BB2 component name is.

Since the protocol agnostic parts are the *minority* of ConnStateData
logic in client_side* it is simpler in terms of file renaming to do (b)
han (a). That is where we are partially through with stage #1 now. I
don't see why you want to reverse direction completely and do the much
larger code shuffling just so you can avoid a temporary transitional state.

FYI: the very first thing I did way back was try to shuffle all three
classes out to different .cc files without rearranging the code
dependencies first. That is where I ran quickly into trouble with them
accessing each others internals and sharing static functions in the .cc

> 
> [4] http://www.squid-cache.org/mail-archive/squid-dev/201408/0145.html
> 
> 
>>> * It forces new code to add pure virtual methods that ConnStateData then
>>> overrides and implements, reminding me of the old Store class that
>>> slowly accumulated dozens of such methods because we needed to add
>>> something that one of the Store kids implemented.
> 
>> This is backwards.
> 
> Exactly! The ConnStateData split forces us to do things backwards.

No I meant your proposal is backwards. Moving a mountain instead of a
molehill.

> 
>> ConnStateData is the old class that has accumulated
>> all those extra bits.
> 
> Yes.
> 
> 
>> We are trying to remove that ConnStateData scope creep.
> 
> The correct way to remove code misplaced in ConnStateData is to move
> that misplaced code (rather than to split ConnStateData in two and then
> move everything else into the second class, leaving misplaced code in
> ConnStateData, as misplaced as it was when we started).
> 

What remains will be BB2.

> 
>> virtuals in Server are ideally implemented by the end-child
>> HTTP/FTP/whatever Server class. The ones that are implemented by
>> ConnStateData itself are in a temporary stage, until they can either be
>> pushed down to the childs, or the bits they depend on using from
>> ConnStateData and also moved up to Server and the virtual-ness dropped.
> 
> True. The ConnStateData split forces us to add temporary complexity that
> is completely unnecessary.
> 
> 
> 
>>> I recommend merging it back into ConnStateData until Squid actually
>>> needs it (or needs something like it). Merging it into ConnStateData
>>> would preserve all code improvements we have done after it was split. If
>>> somebody finds that useful, we can even keep its method definitions
>>> inside Server.cc, at least for a while.
> 
>> Doing that undermines 3 years work towards getting rid of ConnStateData.
> 
> I do not see how it undermines anything. To get rid of ConnStateData, we
> need to get rid of non-Server code in ConnStateData and then rename
> ConnStateData. Splitting ConnStateData in two does not help with that.
> It actually hurts!

You are calling to revert the changes that separate the Server logics
from the BB2+Server logics still in ConnStateData. The result of that
will be us spending an indefinite amount of years manually policing that
the one set of methods never call the other set of methods despite
sharing class scope, member variables etc.

> 
> I may be very wrong, but I sense that you might be attaching some
> special significance to the 13 letters "ConnStateData". It may be
> spelled "ConnStateData" but it reads "Server". We just did not want to
> rename that class while it is still bloated with non-Server code, that's
> all. I even said that explicitly[4]: "For example, when documenting
> something about ConnStateData inside FwdState, call it Server instead of
> ConnStateData. This will reduce future changes as the classes get renamed".
> 
> I certainly regret if something I said earlier was misinterpreted as the
> need to create a Server class that slowly accumulates code already
> properly placed in ConnStateData. I hope the above arguments prove that
> we must do the opposite. Pretty much everything in design and code
> refactoring is a matter of opinion, but this choice is as close to being
> "formally provable" as it gets.

You seem to have forgotten the reason why you agreed to it after (3?)
months of discussion last time around. I dont think that was ever
written down exactly so can't point at it, but I do remember you
agreeing to it early one morning for me.

I certainly regret that you think anything at all of the current logic
should be retained in ConnStateData. Even if it gets a name change.

Whichever way we go what the ::Server needs is:
1)
 a) read I/O handlers that pass off to
 b) a child-supplied Parser, and take the resulting Parser object to
construct
 c) an HttpMsg / HttpRequest and Http::Xaction. Adds that to
 d) a Pipeline set of transactions being done.

 NP: the *child* triggers BB2 to do its thing for each xaction in a
child-specific way.
 Also (1c) objects can be child-specific, which will help with FTP.

2)
 a) clientStreams? output API which
 b) Packs a message for output in a child-specific way, and calls a
 c) generic write I/O method, which sends the packed buffer blob and
shedules a child-provided Call on completion.

3) traffic accounting for (1) and (2). Possibly including use of
client_delay_pools, but child-specific assignment of what pools are applied.

Thats it. The 5 (HTTP/2 makes 6) protocols supported for client I/O do
not share anything else.

Some things to note about that list vs where we are today:

* the design of ::Server presenting generic I/O does not exist in
ConnStateData. Never has. What it has is logics for Http1::Server child
to read and write directly to the socket. That is making all the child
protocols except HTTP/1 more complicated than they should be.

* that ssl-bump, PROXY protocol, HTTP/1 vs h2 vs FTP, HTTP/1 1XX status
handling etc are all BB2 or Server-child specific activities. The child
duty in some places may be simplified down to passing an appropriate
Parser/Packer object for the ::Server to use via its Parser/Packable API
- but that is still child specific.

* all the above pieces except Pipeline [done-ish] and the (2a) API [not
yet discussed] have already been proposed, discussed and then blocked in
audit months or years ago.

* the remaining task to drop ConnStateData from existence is to
untangle/rename/move its protocol-specific logic out to the child
classes and BB2 components.
 We dont need to go near the existing Server vs ConnStateData separation
to do that shuffling. In fact ConnStateData is acting somewhat as a TODO
list of things yet to be shuffled.

Amos