A C++ Mutex Operation for Intel-compatible Processors

Interprocess synchronization has always been classified as a “slow” operation, but that is largely because the Windows mutex implementation requires a switch to kernel mode (an expensive operation) each time you attempt to acquire a mutex. A much faster, but less general, mutex can be built on Intel-compatible processors. The implementation presented here requires only a single integer to store the mutex state:

#define ASM __asm
#define asmcall __declspec(naked)
#define AsmPrepare(x) (AsmPrepareImpl(&(x)))
#define AsmAcquire(x) (AsmAcquireImpl(&(x)))
#define AsmRelease(x) (AsmReleaseImpl(&(x)))

/***
 * A value of -1 means the mutex is free to be acquired
 * Any other value means it is taken
 ***/
void AsmPrepareImpl(void * data)
{
  *((uint *)data) = -1; // 'uint' is assumed to be a typedef for a 32-bit unsigned int
}

asmcall inline void AsmAcquireImpl(void * data)
{
  ASM {
  top:
    MOV EAX, [ESP + 4]       ; // load the address of 'data' into EAX
    LOCK INC DWORD PTR [EAX] ; // atomically increment the DWORD at 'data'
    JNZ zop                  ; // ZF is set only if the resulting value is 0 (it was -1 before)
    RET 4                    ; // this is reached if ZF was set, we now own the mutex
  zop:
    LOCK DEC DWORD PTR [EAX] ; // undo our increment of the DWORD at 'data'
    PUSH 0                   ; // the argument to Sleep
    CALL Sleep               ; // give up the rest of the time slice before trying again
    JMP top                  ; // if at first we don't succeed ... try and try again
  }
}

asmcall inline void AsmReleaseImpl(void * data)
{
  ASM {
    MOV EAX, [ESP + 4]       ; // load the address of 'data' into EAX
    LOCK DEC DWORD PTR [EAX] ; // atomically decrement the DWORD at 'data'
    RET 4                    ; // return
  }
}

The assembly code above assumes the stdcall calling convention, because each function pops its own argument with RET 4: compile with /Gz or add __stdcall to the function declarations. While you can easily see that the code implements a stripped-down mutex, it has some shortcomings:

The mutex does not allow the owning thread to acquire it multiple times; a second attempt by the owner simply blocks forever. In other words, it does not “count.”
The data variable must reside in shared memory for synchronization between processes; otherwise it can only synchronize threads of the same process.
If acquiring the mutex fails, the thread does not actually sleep. It enters a loop in which it yields quickly to other threads, which can waste CPU time if many threads are left waiting for the same mutex.
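
The same increment-and-test idea, along with the first and third shortcomings, can be sketched in portable C++ with std::atomic. This is only an illustration under stated assumptions, not the assembly routine itself: the names FastMutex, fast_try_acquire, fast_acquire, fast_release and run_counter_demo are hypothetical, and std::this_thread::yield stands in for the Sleep(0) call.

```cpp
#include <atomic>
#include <cstdint>
#include <thread>
#include <vector>

// Hypothetical portable counterpart of the mutex described in the text.
// As in the text: -1 means free, any other value means taken.
struct FastMutex {
    std::atomic<std::int32_t> state{-1};
};

// One non-blocking attempt: returns true if the caller now owns the mutex.
inline bool fast_try_acquire(FastMutex& m) {
    // fetch_add returns the previous value; -1 means we moved the state to 0.
    if (m.state.fetch_add(1) == -1) return true;
    m.state.fetch_sub(1);  // someone else owns it: undo our increment
    return false;
}

// Blocking acquire: it never waits on a kernel object, it only yields and
// retries, so waiting threads keep consuming CPU time.
inline void fast_acquire(FastMutex& m) {
    while (!fast_try_acquire(m)) std::this_thread::yield();
}

inline void fast_release(FastMutex& m) {
    m.state.fetch_sub(1);  // back to -1 once no acquirer is mid-attempt
}

// Demo helper: several threads increment a plain counter under the mutex.
inline long run_counter_demo(int threads, int iterations) {
    FastMutex m;
    long counter = 0;
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&m, &counter, iterations] {
            for (int i = 0; i < iterations; ++i) {
                fast_acquire(m);
                ++counter;  // protected by the mutex
                fast_release(m);
            }
        });
    }
    for (auto& th : pool) th.join();
    return counter;
}
```

Calling fast_try_acquire twice in a row on the same FastMutex returns true and then false, which shows that the mutex does not count. Had the second call been fast_acquire, the owning thread would spin forever.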

Due to these shortcomings, the fast mutex implementation should only be used in cases where:

Contention for a mutex is rare, as in a lock on a database row.
Synchronization speed is really an issue and you can afford the time spent writing code that does not depend on being able to acquire the same mutex multiple times. For many projects it is not a trivial task to convert existing code into this form.

It would not be fair to list the shortcomings of this mutex without also listing the advantages, so here they are:

Speed: Over 11 times faster than the Platform SDK mutex on my machine.
Low memory overhead: Just a single 32-bit integer per mutex.
The single DWORD (integer) of memory used for a mutex can be allocated practically anywhere, including right in the middle of existing data structures, which further improves locality of reference and speed.
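
To illustrate the last point, here is a minimal sketch of a lock word embedded directly in a record, next to the data it protects. Row, RowGuard and deposit are hypothetical names chosen for the example, and the static_assert encodes an assumption (true on mainstream x86 compilers, but not guaranteed by the C++ standard) that a 32-bit atomic occupies exactly four bytes.

```cpp
#include <atomic>
#include <cstdint>
#include <thread>

// Hypothetical record with its lock word stored alongside the protected data.
struct Row {
    std::int32_t id = 0;
    std::atomic<std::int32_t> lock{-1};  // -1 = free, anything else = taken
    std::int32_t balance = 0;
};

// Assumption: the atomic adds no storage beyond the 32-bit integer itself.
static_assert(sizeof(std::atomic<std::int32_t>) == sizeof(std::int32_t),
              "expected a 4-byte lock word");

// Scope guard that acquires the embedded lock and releases it on exit.
class RowGuard {
public:
    explicit RowGuard(Row& row) : row_(row) {
        // Same increment-and-test loop as the mutex described in the text.
        while (row_.lock.fetch_add(1) != -1) {
            row_.lock.fetch_sub(1);     // undo the failed attempt
            std::this_thread::yield();  // let the owner make progress
        }
    }
    ~RowGuard() { row_.lock.fetch_sub(1); }
    RowGuard(const RowGuard&) = delete;
    RowGuard& operator=(const RowGuard&) = delete;

private:
    Row& row_;
};

// Updates the row while holding its own embedded lock.
inline std::int32_t deposit(Row& row, std::int32_t amount) {
    RowGuard guard(row);
    row.balance += amount;
    return row.balance;
}
```

Because the lock lives inside the record, touching the lock and the balance is likely to hit the same cache line, which is the locality benefit mentioned above.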


About Our Journalist

Charlie Frank
