Implementing Read/Write Locks Portably

In multi-threaded programs you often find a usage pattern where many more threads need to read a shared resource than write to it, and where reading the resource would be thread-safe without mutexes if it is not being written.

(We’ll be making reference to an earlier article about multi-threading and mutexes.)

Say, for example, you have a hash table that lets you look up strings via a string id:

class StringTable
{
public:
   u32         addString( const char* str ) { return mTable.add( str ); }
   const char* getString( u32 id ) const    { return mTable.get( id ); }

private:
   HashTable<u32,const char*> mTable;
};

Often, for classes used from multiple threads, the const member functions would be thread-safe to call concurrently, but the non-const ones would not. That is the case here.

Now imagine the usage pattern is that 1 in 1000 operations on StringTable are addString(), and the rest are getString(). All of your threads could read from this table simultaneously without mutexes, achieving perfect parallelism, except that 1 time in 1000 the table might be in the middle of a write.

If all you have are mutexes, you have to lock on both operations, and your shared StringTable objects become a lock contention bottleneck.
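
For contrast, here is a sketch of that mutex-only version. It assumes the KxMutex class from the earlier article exposes lock() and unlock(); every call, whether it reads or writes, contends for the same lock, so readers serialize behind each other as well as behind the occasional writer.

class StringTable
{
public:
   u32 addString( const char* str )
   {
      mMutex.lock();                      // writers and readers all contend here
      u32 id = mTable.add( str );
      mMutex.unlock();
      return id;
   }

   const char* getString( u32 id ) const
   {
      mMutex.lock();                      // even pure reads serialize on this lock
      const char* str = mTable.get( id );
      mMutex.unlock();
      return str;
   }

private:
   HashTable<u32,const char*> mTable;
   mutable KxMutex            mMutex;     // mutable so the const getString() can lock
};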

What if you could lock your mutex for reading and writing separately? Read locks would always succeed unless the object is locked for write. Write locks would work like regular mutex locks.

This type of synchronization object is called a read/write lock or a shared/exclusive lock.
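
As an aside, if your toolchain is new enough, C++17 already ships this primitive as std::shared_mutex, taken through std::shared_lock for readers and std::unique_lock for writers. A minimal, self-contained sketch (the names gLock, gTable, lookup, and insert are just illustrative):

#include <mutex>
#include <shared_mutex>
#include <string>
#include <unordered_map>

std::shared_mutex                          gLock;
std::unordered_map<unsigned, std::string>  gTable;

std::string lookup( unsigned id )
{
   std::shared_lock<std::shared_mutex> lock( gLock );   // shared: many readers at once
   auto it = gTable.find( id );
   return it == gTable.end() ? std::string() : it->second;
}

void insert( unsigned id, const char* str )
{
   std::unique_lock<std::shared_mutex> lock( gLock );   // exclusive: one writer, no readers
   gTable[ id ] = str;
}

The rest of this article is about building the same behavior yourself when all you have is a plain portable mutex.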

Implementing Read/Write Locks

In an earlier article I showed how to write a portable mutex class. You can implement a read/write lock on top of that portable mutex, so the following code should be portable as well.
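
For reference, the KxMutex used below is assumed to look roughly like this: a thin wrapper over the platform's native mutex with lock() and unlock(). The exact class from the earlier article may differ.

// Assumed shape of the portable mutex from the earlier article; details may differ.
#ifdef _WIN32
#include <windows.h>
class KxMutex
{
public:
   KxMutex()      { InitializeCriticalSection( &mCS ); }
   ~KxMutex()     { DeleteCriticalSection( &mCS ); }
   void lock()    { EnterCriticalSection( &mCS ); }
   void unlock()  { LeaveCriticalSection( &mCS ); }
private:
   CRITICAL_SECTION mCS;
};
#else
#include <pthread.h>
class KxMutex
{
public:
   KxMutex()      { pthread_mutex_init( &mMutex, 0 ); }
   ~KxMutex()     { pthread_mutex_destroy( &mMutex ); }
   void lock()    { pthread_mutex_lock( &mMutex ); }
   void unlock()  { pthread_mutex_unlock( &mMutex ); }
private:
   pthread_mutex_t mMutex;
};
#endif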

Let’s look at the interface.

class KxRWMutex
{
public:
   KxRWMutex( u32 maxReaders = 64 ) : 
      mReadsBlocked( false ), mMaxReaders( maxReaders ), mReaders( 0 ) {}

   void readLock();
   void readUnlock();

   void writeLock();
   void writeUnlock();

private:
   void waitReaders();

   bool    mReadsBlocked;
   u32     mMaxReaders;
   u32     mReaders;
   KxMutex mReadMutex;
   KxMutex mWriteMutex;
};

I’m calling it a mutex so that its class name matches the earlier class, but technically this isn’t a mutex. A mutex guarantees exclusive access for both reading and writing. This only guarantees that there will be no other readers or writers while it is locked for writing.

Most of the functions are pretty self-explanatory. This object contains two mutexes, mReadMutex and mWriteMutex.

mWriteMutex will be locked during the entire time that the KxRWMutex is locked for writing. But mReadMutex is only locked during the calls to readLock() and readUnlock().

So mWriteMutex actually protects your code, while mReadMutex only protects the KxRWMutex’s own count of how many readers are currently accessing your protected code. That is, it keeps a count of the number of threads currently executing between calls to readLock() and readUnlock().

Let’s look at the implementation of the read lock and unlock calls:

void KxRWMutex::readLock()    
{ 
   while ( 1 )
   {
      mReadMutex.lock();
      if (( !mReadsBlocked ) && ( mReaders < mMaxReaders ))
      {
         mReaders++;             // count ourselves as an active reader and proceed
         mReadMutex.unlock();
         return;
      }
      mReadMutex.unlock();
      kxSleep( 0 );              // a writer is active or too many readers; yield and retry
   }
   assert( 0 );                  // never reached
}

void KxRWMutex::readUnlock()  
{ 
   mReadMutex.lock(); 
   assert( mReaders );
   mReaders--; 
   mReadMutex.unlock(); 
}

readUnlock() is pretty easy. When you unlock it, it decrements the number of current readers.

readLock() is a little trickier, because when you lock for reading you have to consider that the mutex may currently be locked for writing, and during that write lock the writeLock() call will be waiting for readers to finish. This lock also keeps track of a maximum number of readers, so it is possible that there are already too many readers.

So readLock() waits in a loop for writeUnlock() to signal that it is ok to read again. It also waits for the number of readers to drop back below the maximum.

kxSleep() is a portable function that puts the thread to sleep for a certain amount of time. If you sleep for 0 microseconds, you’re just telling the task scheduler to switch to another thread immediately. At the end of each loop iteration you know there is no more work this thread can do until another writer or reader finishes its work. It is similar to calling the POSIX sched_yield() function.

Performance will generally be better if you yield or sleep for zero time than if you give an explicit delay. If your task scheduler is primitive, however, it is possible that an actual delay could perform better.
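
For completeness, here is a sketch of what kxSleep() might look like; the real implementation from the earlier article may differ, and zero is treated as a plain yield.

// Assumed shape of kxSleep(); details may differ from the earlier article.
#ifdef _WIN32
#include <windows.h>
inline void kxSleep( u32 microseconds )
{
   Sleep( microseconds / 1000 );    // Windows Sleep() takes milliseconds; Sleep(0) yields
}
#else
#include <sched.h>
#include <unistd.h>
inline void kxSleep( u32 microseconds )
{
   if ( microseconds == 0 )
      sched_yield();                // hand the CPU to another runnable thread
   else
      usleep( microseconds );       // usleep() takes microseconds
}
#endif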

Note that in each of these functions we both lock and unlock mReadMutex inside the call. That is, your code is not protected from multiple readers, as designed.

Now let’s look at the write functions:

void KxRWMutex::writeLock()
{
   mWriteMutex.lock(); 
   waitReaders();
}

void KxRWMutex::writeUnlock()
{
   mReadMutex.lock();
   mReadsBlocked = false;
   mReadMutex.unlock();
   mWriteMutex.unlock(); 
}

These are pretty simple. writeLock() locks the write mutex and then waits for all the current readers to finish. writeUnlock() signals that it is ok to start reading again and then unlocks mWriteMutex. We protect the write to mReadsBlocked from reading threads. We don’t have to protect it from other writing threads, because mWriteMutex ensures there can only ever be one writer at a time, and we’re it.

And finally the function that waits for all readers to finish:

void KxRWMutex::waitReaders()
{
   // block new readers 
   mReadMutex.lock();
   mReadsBlocked = true;
   mReadMutex.unlock();  

   // wait for current readers to finish
   while ( 1 )
   {
      mReadMutex.lock();
      if ( mReaders == 0 )
      {
         mReadMutex.unlock();  
         break;
      }
      mReadMutex.unlock();  
      kxSleep( 0 );
   }
   assert( mReaders == 0 );
}

First we signal that no more readers can start; again, the flag is protected from reading threads by mReadMutex. Next we wait in a loop for the count of readers to go to zero. As in readLock(), we yield to the task scheduler at the bottom of each pass through the wait loop.

You may be wondering why we limit the total number of readers. Why not let any number of readers through the read lock? Because we have to balance the amount of resources we give to reading and writing threads. The more readers you allow, the longer write locks may have to wait when you lock for writing.

This is something you may want to tune for your application. Generally, the less often you lock for writing, the more readers you should allow, unless you need the write locking to have very low latency. In that case you may want to sacrifice some read parallelism in order to reduce write-lock latency.
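
As a purely illustrative sketch of that trade-off (these values are hypothetical, not from any measurement):

KxRWMutex lowWriteLatency( 8 );    // few concurrent readers, so a writer never waits behind many of them
KxRWMutex readMostly( 1024 );      // writes are rare, so favor reader parallelism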

Protecting a Shared Object with a Read/Write Lock

Now we close the loop and use the read/write lock to protect our StringTable class:

class StringTable
{
public:
   u32 addString( const char* str ) 
   { 
      mMutex.writeLock();
      u32 id = mTable.add( str ); 
      mMutex.writeUnlock();
      return id;
   }

   const char* getString( u32 id ) const    
   { 
      mMutex.readLock();
      const char* str = mTable.get( id ); 
      mMutex.readUnlock();
      return str;
   }

private:
   HashTable<u32,const char*> mTable;
   mutable KxRWMutex          mMutex;   // mutable so the const getString() can take the read lock
};
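
To see the lock in action, here is a hedged usage sketch (not from the original article) that hammers the table from several reader threads while one thread occasionally writes. std::thread is used purely for illustration in place of a portable thread class.

#include <thread>
#include <vector>

void exerciseTable( StringTable& table )
{
   u32 lastId = table.addString( "first" );            // seed one entry

   std::vector<std::thread> readers;
   for ( int t = 0; t < 8; ++t )
   {
      readers.push_back( std::thread( [&table, lastId]()
      {
         for ( u32 i = 0; i < 100000; ++i )
            (void)table.getString( lastId );           // read locks run in parallel
      } ) );
   }

   std::thread writer( [&table]()
   {
      for ( u32 i = 0; i < 100; ++i )
         table.addString( "occasional write" );        // each write excludes all readers
   } );

   for ( size_t i = 0; i < readers.size(); ++i )
      readers[i].join();
   writer.join();
}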

Resources:

Readers–writer lock: some theory behind read/write locks.