A multi-connection AFD-based echo server
Last time I looked at a way of using \Device\Afd to perform individual socket polling for readiness. This differed from the previous approach of using \Device\Afd, which batched up the sockets and issued a single poll for multiple sockets.
The individual socket polling approach appeals to me as it would appear to scale more easily, and putting together an echo server that supports multiple connections is now much easier. It doesn’t map as well to the way other operating systems do things though, so if that’s your primary goal, then you’re probably better off continuing with the ‘set of sockets’ approach.
Full source can be found here on GitHub.
This article refers to the socket_without_device_afd code.
This isn’t production code; error handling is simply “panic and run away”.
This code is licensed with the MIT license.
Comparing the code in socket_without_device_afd/echo_client with the previous approach in socket/echo_client, it’s fairly obvious that the new approach reduces the complexity somewhat. There’s less code, though most of the differences are in the tcp_socket class, which operates slightly differently.
The polling we were doing before this worked with a single handle to \Device\Afd which was associated with an I/O Completion Port. We only used the ‘per operation’ data, that is, what is returned as a pointer to the ‘overlapped’ structure that we provided when we made the polling call. This worked OK in our tests, but there were some mistakes being made.

Firstly, when polling we had to specify a flag, Exclusive, in the AFD_POLL_INFO structure. We were setting this to FALSE and this allowed us to poll multiple times for a single handle, with each poll treated separately. Unfortunately, we were polling using the same ‘per operation’ data multiple times, expecting this to result in one poll being set up, but instead it meant that multiple polls were being set up, all of which could return when the required conditions were met. The correct approach for the design we had was to set the Exclusive flag to TRUE. This change meant that we would get a single response to a poll, but that issuing another poll on a given handle whilst a poll was already active would result in the first poll completing as canceled and then the new poll being registered.
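For reference, AFD_POLL_INFO isn’t formally documented; the sketch below shows the layout as used by open source projects such as wepoll, which is the layout I’m assuming here. The field that matters for this discussion is Exclusive.

```cpp
#include <windows.h>
#include <winternl.h>

// Layout of the undocumented AFD poll structures, as used by projects such as
// wepoll; the field names here are an assumption based on that public usage,
// not an official Microsoft definition.
typedef struct _AFD_POLL_HANDLE_INFO {
   HANDLE Handle;                    // base socket handle being polled
   ULONG Events;                     // AFD_POLL_* events of interest
   NTSTATUS Status;                  // per-handle status on completion
} AFD_POLL_HANDLE_INFO;

typedef struct _AFD_POLL_INFO {
   LARGE_INTEGER Timeout;            // poll timeout
   ULONG NumberOfHandles;            // 1 for per-socket polling
   ULONG Exclusive;                  // TRUE: a new poll on this handle cancels
                                     // any poll that is already pending
   AFD_POLL_HANDLE_INFO Handles[1];  // variable length array of handles
} AFD_POLL_INFO;
```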
Next, it makes more sense, when dealing with per-socket polling, to associate the socket with the IOCP using the ‘completion key’ parameter. This is an opaque value that is associated with the handle when you link the handle to the IOCP and which is then returned from each completion for that handle. This is ‘per device’ data and needn’t be convertible to an OVERLAPPED for correct operation. Using this approach, we can associate the afd_events interface on the tcp_socket class with the socket handle when we link it to the IOCP and then call into this interface to handle the completions when they are returned from GetQueuedCompletionStatus().
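In plain Win32 terms this looks something like the sketch below; the actual tcp_socket and afd_events interfaces are in the linked code, so the handler type here is just a stand-in.

```cpp
#include <winsock2.h>
#include <windows.h>

// Associate a socket with an IOCP, using a pointer to the object that will
// handle its events as the 'completion key'.
void associate(HANDLE iocp, SOCKET s, void *pEventHandler)
{
   if (CreateIoCompletionPort(reinterpret_cast<HANDLE>(s),
                              iocp,
                              reinterpret_cast<ULONG_PTR>(pEventHandler),
                              0) == nullptr)
   {
      // "panic and run away" error handling, as in the article's code
      ExitProcess(GetLastError());
   }
}

// Each completion returns that key, so we can route the completion straight
// back to the per-socket handler without any lookup.
void process_one_completion(HANDLE iocp)
{
   DWORD numberOfBytes = 0;
   ULONG_PTR completionKey = 0;
   OVERLAPPED *pOverlapped = nullptr;

   if (GetQueuedCompletionStatus(iocp, &numberOfBytes, &completionKey, &pOverlapped, INFINITE))
   {
      auto *pHandler = reinterpret_cast<void *>(completionKey);

      // dispatch to pHandler (e.g. the connection's event interface)...
      (void)pHandler;
      (void)pOverlapped;
   }
}
```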
Then there’s the question of whether it’s more efficient, from a system call perspective, to poll for each socket independently or to group sockets together and poll using a handle to \Device\Afd.

The independent polling approach requires a system call to set up the poll and a system call to retrieve the poll completion. This results in two calls per poll, whether or not we receive any events. Theoretically, we could issue a poll and then decide to poll for more, or fewer, event types whilst that poll is active; we would then have to deal with the cancellation completion for the first poll before dealing with any events returned from the second. In practice, we are likely to set up a poll once for a socket and only change the poll as a result of processing a completion, before we set up the next poll, so the ‘cost’ of cancelling an incomplete poll is unlikely to be incurred.

With polling for a set of sockets using a single handle, we at first appear to have an advantage in working in batches of sockets: one system call to set up a poll for many sockets and one system call to retrieve completions for, potentially, many sockets. In practice, we would need to cancel the poll whenever we changed the polling requirements for any socket in the set, which would likely be every time we get a completion for any socket in the set. This results in a poll per socket operation, which is the same as for independent polling. For completions, we can attempt to ameliorate the system call cost of retrieving completions by using GetQueuedCompletionStatusEx() to retrieve a variable number of completions with each call…
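For example, something along these lines drains a batch of completions with a single call; the dispatch step is a placeholder for whatever the server does with each completion.

```cpp
#include <windows.h>

// Retrieve up to 64 completions with a single system call. A minimal sketch;
// the batch size and dispatch logic are illustrative only.
void process_completions(HANDLE iocp)
{
   OVERLAPPED_ENTRY entries[64];
   ULONG numEntriesRemoved = 0;

   if (GetQueuedCompletionStatusEx(iocp, entries, 64, &numEntriesRemoved, INFINITE, FALSE))
   {
      for (ULONG i = 0; i < numEntriesRemoved; ++i)
      {
         ULONG_PTR key = entries[i].lpCompletionKey;
         OVERLAPPED *pOverlapped = entries[i].lpOverlapped;

         // dispatch to the per-socket handler identified by 'key'...
         (void)key;
         (void)pOverlapped;
      }
   }
}
```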
Finally, if we enable SKIP_COMPLETION_PORT_ON_SUCCESS processing for the socket handle, we can avoid unnecessary system calls when issuing a poll for a socket for which a poll is already pending.
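The Win32 mechanism behind this is, I assume, the FILE_SKIP_COMPLETION_PORT_ON_SUCCESS flag set via SetFileCompletionNotificationModes(); a minimal sketch of enabling it on a socket handle:

```cpp
#include <winsock2.h>
#include <windows.h>

// Enable 'skip completion port on success' for a socket handle so that calls
// which complete synchronously don't also queue a completion to the IOCP.
void enable_skip_on_success(SOCKET s)
{
   if (!SetFileCompletionNotificationModes(reinterpret_cast<HANDLE>(s),
                                           FILE_SKIP_COMPLETION_PORT_ON_SUCCESS))
   {
      ExitProcess(GetLastError());
   }
}
```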
My gut feeling is that independent polling is no less efficient than polling for a set of handles together and is considerably easier to code for; at least for someone with a background in using ’normal’ completion-style IOCP socket designs.
In our independent polling echo server we have a listening socket which is similar in design to the one in the previous example, and an echo_server_connection which is a simple class wrapper around a tcp_socket and which provides state for each socket connection. Each time the listening socket accepts a new connection it creates an instance of the echo_server_connection object and links its socket to the IOCP that it is using. We then issue a poll on the socket for events and wait for a completion. Each connection can easily be associated with the same IOCP and polled independently, with no need for any code to manage the set of all the connections as a group.
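A sketch of that accept path is below; the real echo_server_connection and tcp_socket interfaces are in the linked code, so the class shape and poll_for_events() call here are stand-ins.

```cpp
#include <winsock2.h>
#include <windows.h>

// Hypothetical shape of the per-connection object; the real
// echo_server_connection wraps a tcp_socket and holds per-connection state.
class echo_server_connection
{
   public:
      explicit echo_server_connection(SOCKET s) : s(s) {}

      void poll_for_events() { /* issue the AFD poll for this socket */ }

   private:
      SOCKET s;
};

// Accept path sketch: each new connection is associated with the same IOCP,
// using the connection object as the completion key, and polled independently.
void on_accept(HANDLE iocp, SOCKET listeningSocket)
{
   SOCKET s = accept(listeningSocket, nullptr, nullptr);

   if (s != INVALID_SOCKET)
   {
      auto *pConnection = new echo_server_connection(s);

      if (CreateIoCompletionPort(reinterpret_cast<HANDLE>(s),
                                 iocp,
                                 reinterpret_cast<ULONG_PTR>(pConnection),
                                 0) == nullptr)
      {
         ExitProcess(GetLastError());
      }

      pConnection->poll_for_events();
   }
}
```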
Once we have a connection, we wait for a readability notification and then read data until we’ve either read all there is to read or we have filled our per connection buffer. We then echo this data back to the client by writing as much as we can. If we can’t write everything, we wait for the socket to become writeable and continue to read until our buffer is full.
This all works well, with one poll and one completion per group of events that we handle.
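As a rough sketch of the echo step driven by a readability notification, assuming a non-blocking socket and a simple per-connection buffer; the logic in the real echo_server_connection differs in detail.

```cpp
#include <winsock2.h>
#include <cstring>

// Handle a readability notification: read as much as will fit, echo back as
// much as will send, and note whether we now need a writability notification.
void on_readable(SOCKET s, char *buffer, int bufferSize, int &used, bool &wantWritability)
{
   // Read until there's nothing left to read or the buffer is full...
   while (used < bufferSize)
   {
      const int bytes = recv(s, buffer + used, bufferSize - used, 0);

      if (bytes > 0)
      {
         used += bytes;
      }
      else
      {
         break;   // 0 = connection closed, SOCKET_ERROR = would block or error
      }
   }

   // ...then echo back as much as we can.
   int sent = 0;

   while (sent < used)
   {
      const int bytes = send(s, buffer + sent, used - sent, 0);

      if (bytes > 0)
      {
         sent += bytes;
      }
      else
      {
         break;   // would block; we need a writability notification
      }
   }

   // Keep any unsent data and ask for a writability poll if we couldn't send it all.
   used -= sent;
   std::memmove(buffer, buffer + sent, used);
   wantWritability = (used != 0);
}
```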
Wrapping up
The main question now is “is there any point?” If we’re not using a design that
maps easily to the way other operating systems do this, then is there really any
advantage in using readiness polling rather than completion handling? I think
there may be. For one thing, dealing with TCP flow control using a standard
completion-based design can be complex, as can be seen here.
There’s also a possible performance gain for datagram sockets, as we’re rarely interested in send completions and would usually want to pull all data off the wire in a tight loop, and so a read readiness notification is likely more useful to us than a series of read completions, even if we retrieve them as a batch using GetQueuedCompletionStatusEx(). I won’t know any of this for sure until I can compare performance, but that can come next…
More on AFD
- Adventures with \Device\Afd
- Test Driven Understanding
- Test Driven Design
- A simple client
- A simple server
- More \Device\Afd goodness
- Socket readiness without \Device\Afd
- A multi-connection AFD-based echo server - this post