Bitten by 'scoped static initialisation' in C++ - turn on Visual Studio's C4640 warning!
Today I discovered that C++ scoped static initialisation (function level) in Visual Studio is not done in a thread safe manner. It’s the kind of thing that I should have already known but I guess I assumed that since namespace level static initialisation was safe so was function level static initialisation. Unfortunately it’s not. If you read the MSDN documentation in a particular way you could decide that the docs say that it’s not; but it’s imprecise and unclear and I doubt that I did read the MSDN documentation anyway. Looking at the generated code from Visual Studio 2013 clearly shows that there’s no thread safety involved.
** test byte ptr ds:[5C87D0h],1
jne JetByteTools::IO::CBufferChainStoresNulls::InternalAddToFront+77h (0473967h)
or dword ptr ds:[5C87D0h],1**
xor eax,eax
mov dword ptr [esp+1Ch],eax
push 2Dh
push 56DFE8h
mov ecx,5C87B4h
mov dword ptr ds:[5C87CCh],7
mov dword ptr ds:[005C87C8h],eax
mov word ptr ds:[005C87B8h],ax
call std::basic_string<wchar_t,std::char_traits<wchar_t>,std::allocator<wchar_t> >::assign (0404130h)
push 55C8B0h
call atexit (047E190h)
add esp,4
mov dword ptr [esp+1Ch],0FFFFFFFFh
void CBufferChainStoresNulls::InternalAddToFront(
mov edi,dword ptr [esi+8]
</wchar_t></wchar_t,std::char_traits<wchar_t>
The code in bold is the one time static initialisation test, check a value and if set jump past the initialisation, else set the value. Clearly multiple threads could pass through this check at the same time and both end up creating and setting the value to the static variable. I was aware of this but had dismissed it as a minor issue which would, at worst, result in a little extra work and an extra copy of an object which would be cleaned up at program exit. There’s a far worse problem though; a second thread could get to the test just after the first thread had set the ‘constructed’ flag but before it had actually completed constructing the variable. The second thread could then use the variable in an partially constructed state. Raymond Chen goes gives more detail on the issue here.
I’m not sure how I got into the habit of using this code pattern but I seem to have used it mostly as an attempt a micro optimisation where constant strings were required. I’m currently scanning my source code trees to remove all traces of it. There are very few situations where it might have been a valid choice and so far it’s easy to remove.
I’m also not really sure how I’d convinced myself that block scoped static constant variables were OK. The fact that I knew they were lazily constructed at first use and the fact that I knew that I was using them as a micro optimisation means that I must have known that there was no compiler injected synchronisation. I guess I just assumed that “magic happens here”… It’s one of those head slapping moments where the issue is SO obvious once you know about it. Anyway, I live and learn.
I guess I’ve been very lucky (or unlucky, you decide) up until now. Most of the scenarios where I’ve used this pattern have been where the object construction was obviously fast enough that I never had two threads in exactly the right places at exactly the right time. So it’s a standard latent race condition… It finally came to my attention because I was using the “micro optimisation” to create some strings in a function that was being called by multiple threads. My build machine’s unit tests were failing, rarely and strangely. I was getting an access violation in code that looked completely fine. Eventually I tracked the problem down by turning off compiler optimisation and generating a mini dump file when I converted the SEH exception for the access violation into a C++ exception… It then took a little while to work out exactly what the problem was.
Luckily I already compile with /Wall and with ‘warnings as errors’. Visual Studio’s Warning C4640 will locate the issue for me and cause my builds to fail until I fix it. All I need to do is remove the warning suppression #pragma that I have for that warning in my global warning suppression header file… Whilst I’m at it, I guess I should take a good look at the other warnings that are being suppressed…