Here’s some code for lock free queues for a single producer and single consumer. The code is designed for Intel multiprocessors with strong memory model. I don’t know what ARM offers these days. But the strong memory model for x86 means that the program doesn’t need any special synchronizing operations at all. All it needs are a couple of volatile declarations to keep the compiler from caching values that are changed by the other process/thread.
There’s some use of #defines to make it easier to use static type checking but the core method is to keep data in an array (I also have a non-array based lock free linked list I may get organized) and have the producer increment a tail index and the consumer increment a head index. Increments are mod n where n is the number of elements in the array. The only complexity is full and empty conditions. When the array is empty head== tail but if the producer then fills the array using the last slot should roll the tail back to equal head again. One fix is to just never let the array fill up completely – reserve one element as a buffer. But that would be too easy. So I use one of the bits in the head and tail pointers to indicate condition. That bit it never used to calculate the index. When the producer fills the array it sets the bit in tail to be the complement of the value of that bit in head. When the consumer empties, it sets the bit in head to be the same as the value in tail.
h= h& OWQ_OFFBIT;
*i = ((OWQ_ELEMENT_T *)q->v)[h];
next = (h + 1) % q->z;
if(next == (t & OWQ_OFFBIT) ) q->h = t; //empty
else q->h = next;
The main code uses the high order bit, but I also have code using the low order bit and shifting so you can see the comparison.
There’s an example program called test_owq.c and the main code is a header file q.h