Practical Testing: 33 - Intrusive multi-map.

Page content

Previously on “Practical Testing”… I’m in the process of replacing STL containers with custom intrusive containers in the timer system that I have been developing in this series of articles. The idea is that the intrusive containers do not require memory operations for insertion or deletion as the book-keeping data required to store the data in the container has been added to the data explicitly. This reduces the potential contention between threads in the application and, hopefully, improves overall performance.

In the last instalment I showed how my usage of std::set in the timer wheel code could be replaced by an intrusive set. This time I’ll also replace the std::set and std::map from the timer queue. These are quite major internal changes but since we have a large set of unit tests it’s reasonably easy to make these changes without having to worry about breaking the external interface.

The timer wheel’s STL usage is pretty simple. It has a single set of handles that it uses to validate active handles and to clean up active timers when the timer wheel is destroyed. The timer queue is a little more complex. It has the same active handle collection as the timer wheel but in addition to that it also has the timer queue itself which is represented as a multi-map of timers stored by absolute timeout. What’s more, since we need to be able to extract all of the timers for a given timeout in one operation we have a home brewed multi-map implementation that uses an std::map that contains a std::deque of timers for a given timeout value.

Storing data in multiple containers

Since the TimerData for the timer queue needs to be stored in two intrusive collections at the same time, the set and the multi-map, we need two separate nodes. In the timer wheel we only stored the data in the set and so simply inherited the node and defaulted the access function object. With the timer queue we do the same thing for the node required for the intrusive set and then add a data member for the node required for the multi-map. This then requires that we add an explicit function object for accessing the multi-map’s node. We need a second explicit function object for accessing the key. The resulting definition of the multi-map for the timer queue looks like this:

      class TimerDataIntrusiveMultiMapNodeKeyAccessor
      {
         public :

            static ULONGLONG GetKeyFromT(
               const TimerData *pNode);
      };

      class TimerDataIntrusiveMultiMapNodeAccessor
      {
         public :

            static CIntrusiveMultiMapNode * GetNodeFromT(
               const TimerData *pData);

            static TimerData *GetTFromNode(
               const CIntrusiveMultiMapNode *pNode);

            static TimerData *GetTFromNode(
               const CIntrusiveRedBlackTreeNode *pNode);
      };

      typedef TIntrusiveMultiMap<
         TimerData,
         ULONGLONG,
         TimerDataIntrusiveMultiMapNodeKeyAccessor,
         std::less<ULONGLONG>,
         TimerDataIntrusiveMultiMapNodeAccessor> TimerQueue;

The implementation of the function objects is fairly straight forward, see the source code for details.

With the intrusive multi-map added the code that uses it can be fixed up. This results in considerable change as the new multi-map is designed with the required use cases in mind and so is simpler to use than the previous combination of std::map and std::deque. The standard std::map code is pretty much unchanged, except for the change to my house style, but the multi-map usage is considerably more straight forward as the new map presents an interface which allows for the removal of all data with a specific key in one go.

Performance changes

These changes considerably improve the performance of the timer queue’s operations to set, cancel and process timers. Calls to SetTimer() are between 50-60% faster and calls to handle timeouts are around 30% faster. These tests don’t show potential the thread contention improvements though. The changes remove all memory allocation and release from the calls to set, cancel and process timers and so move the potential for contention from C(tq)+(C(ts-tq+1)) to C(tq) (see here for what this means). In other words the only threads that the timer queue will be contending with are other threads that use the timer queue rather than any thread that uses the same heap that the STL containers use.

Source code

The code is here and the new rules apply. This code is in the Public Domain.