Lock-Free Multi-Producer Multi-Consumer Queue on Ring Buffer

Lock-Free Multi-Producer Multi-Consumer Queue

Fortunately, you do not have to ask the kernel for help with user-space thread
synchronization. The CPU (at least in the most common architectures)
provides atomic memory operations and barriers. With these operations, you can
atomically:

Read the memory operand, modify it and write it back.

Read the memory operand, compare it with a value and swap it with the other
value.
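
As a minimal sketch of these two operations, the GCC __sync builtins can be wrapped as shown here. The wrapper names fetch_and_add, compare_and_swap and demo are hypothetical and chosen only for illustration; the builtins themselves are the GCC intrinsics discussed later in the article.

```cpp
#include <cassert>

// Read-modify-write: atomically add `delta` to *addr and return the old value.
long fetch_and_add(long *addr, long delta)
{
    return __sync_fetch_and_add(addr, delta);
}

// Compare-and-swap: store `desired` only if *addr still equals `expected`.
// Returns true if the swap took place.
bool compare_and_swap(long *addr, long expected, long desired)
{
    return __sync_bool_compare_and_swap(addr, expected, desired);
}

// Single-threaded walk-through of both operations (illustrative only).
long demo()
{
    long value = 0;

    long old = fetch_and_add(&value, 5);          // value: 0 -> 5
    assert(old == 0);

    bool swapped = compare_and_swap(&value, 5, 42);   // value is 5, so it becomes 42
    assert(swapped);

    bool stale = compare_and_swap(&value, 5, 7);      // value is 42, not 5: unchanged
    assert(!stale);

    return value;
}
```

The compare-and-swap form is the building block for retry loops: read the current value, compute a new one, and repeat if another thread changed the location in the meantime.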

Memory barriers are special assembly instructions, also known as fences.
Fences guarantee instruction execution order on the local CPU and visibility
order on other CPUs. Let's consider two independent data instructions, A
and B, separated by a fence (let's use mfence, which provides an ordering
guarantee for both read and write operations):

A
mfence
B

The fence guarantees that:

Compiler optimizations won't move A after the fence or B before the
fence.

The CPU will execute A and B instructions in order (it normally
executes instructions out of order).

Other CPU cores and processor packages, which work on the same bus, will
see the result of instruction A before the result of instruction B.
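
The A / fence / B pattern can be sketched in portable C++11, assuming a compiler with std::atomic and std::thread support (build with something like -std=c++11 -pthread). Here A is a plain write to payload, B is a write to the ready flag, and std::atomic_thread_fence stands in for mfence; on x86-64 a sequentially consistent fence typically compiles to mfence or an equivalent locked instruction, but that is a compiler detail. The names payload, ready, producer, consumer and run_fence_demo are hypothetical.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

int payload = 0;                    // plain data, written by instruction A
std::atomic<bool> ready(false);     // flag, written by instruction B

void producer()
{
    payload = 42;                                          // A
    std::atomic_thread_fence(std::memory_order_seq_cst);   // full fence
    ready.store(true, std::memory_order_relaxed);          // B
}

int consumer()
{
    // Wait until the result of B is visible...
    while (!ready.load(std::memory_order_relaxed))
        std::this_thread::yield();
    // ...then fence, so the read of payload cannot be moved before the check.
    std::atomic_thread_fence(std::memory_order_seq_cst);
    return payload;                                        // must observe A
}

// Runs one producer and one consumer; returns what the consumer saw.
int run_fence_demo()
{
    payload = 0;
    ready.store(false);
    int seen = 0;
    std::thread c([&seen] { seen = consumer(); });
    std::thread p(producer);
    p.join();
    c.join();
    return seen;
}
```

Once the consumer sees the flag set, the two fences guarantee that it also sees the payload written before the flag.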

For our queue, we need to synchronize multiple threads' access to the head_ and
tail_ fields. Consider running head_++ on two cores. This is an example of an
RMW (Read-Modify-Write) operation, because the processor must read the current
head_ value, increment it locally and write it back to memory. Both cores
could read the current head_ value simultaneously, increment it and
write the new value back simultaneously, so one increment is lost. For atomic
operations, C++11 provides the std::atomic template, which should replace the
current GCC __sync_* intrinsics in the future. Unfortunately, for my compiler
(GCC 4.6.3 for x86-64), std::atomic<> methods still generate extra fences
regardless of the specified memory order. So, I'll use the old GCC intrinsics
for atomic operations.

We can atomically read and increment a variable (for example, our head_)
by:

__sync_fetch_and_add(&head_, 1);

This makes the CPU lock the shared memory location on which it's going to do an
operation (increment, in our case). In a multiprocessor environment, processors
communicate with each other to ensure that they all see the relevant data. This
is done through the cache coherency protocol. Under this protocol, a processor
can take exclusive ownership of a memory location. However, this communication
is not free, and we should use such atomic operations carefully and only when
we really need them. Otherwise, we can hurt performance significantly.
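
To illustrate that the atomic increment does not lose updates, here is a small sketch in which several threads bump a shared counter with __sync_fetch_and_add. The global head_ below is only a hypothetical stand-in for the queue field, and claim_slots and run_counter_demo are illustrative names; std::thread is assumed to be available (C++11, -pthread).

```cpp
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical stand-in for the queue's head_ field.
static unsigned long head_ = 0;

// Each thread atomically increments head_ `iterations` times.
void claim_slots(unsigned long iterations)
{
    for (unsigned long i = 0; i < iterations; ++i)
        __sync_fetch_and_add(&head_, 1);
}

// Starts `threads` workers and returns the final counter value.
unsigned long run_counter_demo(unsigned threads, unsigned long iterations)
{
    head_ = 0;
    std::vector<std::thread> pool;
    for (unsigned t = 0; t < threads; ++t)
        pool.push_back(std::thread(claim_slots, iterations));
    for (auto &worker : pool)
        worker.join();
    return head_;   // safe to read plainly: all workers have been joined
}
```

With an atomic increment the result is always threads * iterations. With a plain head_++ the same program would contain a data race and could report a smaller total.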

Meanwhile, plain read and write operations on naturally aligned, word-sized
memory locations execute atomically and do not require any additional actions
(like specifying the lock prefix to make the instruction run atomically on the
x86 architecture).

In our lock-free implementation, we're going to abandon the mutex mtx_ and
consequently both condition variables. However, we still need to wait on push
if the queue is full and on pop if the queue is empty. For push, we
would do this with a simple loop like we did for the locked queue:

while (tail_ + Q_SIZE < head_)
    sched_yield();

sched_yield() just lets another thread run on the current processor. This is
the native and fastest way to reschedule the current thread. However, if no
other thread is waiting in the scheduler run queue for an available CPU,
the current thread is scheduled back immediately. Thus, we'll always see 100%
CPU usage, even if we have no data to process. To cope with this, we can use
usleep(3) with some small value.
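
One possible way to combine the two approaches is sketched below: yield for a bounded number of attempts, then fall back to short sleeps so that an idle waiter stops burning CPU. This hybrid policy, the helper name wait_while and the default values for spin_limit and sleep_us are assumptions made for illustration, not part of the queue described in this article.

```cpp
#include <cassert>
#include <sched.h>
#include <unistd.h>

// Waits while `blocked()` returns true.
// Yields the CPU for the first `spin_limit` attempts, then sleeps `sleep_us`
// microseconds per attempt. Returns the number of attempts that had to wait.
template <typename Pred>
unsigned long wait_while(Pred blocked,
                         unsigned long spin_limit = 100,
                         useconds_t sleep_us = 50)
{
    unsigned long attempts = 0;
    while (blocked()) {
        if (attempts < spin_limit)
            sched_yield();      // cheap: lets another runnable thread in
        else
            usleep(sleep_us);   // back off so an idle waiter frees the CPU
        ++attempts;
    }
    return attempts;
}
```

For push, the predicate would be the same full-queue check as in the loop above, passed as a lambda or function object. The sleep interval trades wake-up latency against idle CPU usage, so it has to be tuned for the workload.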

Let's look more closely at what's going on in the loop. First, we read the
tail_ value; next, we read the value of head_; and after that, we decide
whether to wait or to push an element and move head_ forward. The current
thread can be rescheduled at any point during the check and even after the
check. Let's consider the two-thread scenario (Table 1).

Alexander Krizhanovsky is the software architect and founder of NatSys-Lab. Before NatSys-Lab, he worked as a Senior Software Developer at IBM, Yandex and Parallels. He specializes in high-performance solutions for UNIX environments.


Aside: I believe I am the registered user rhkramer (at least, I used to be) but this posting thingie wouldn't let me use that name--I think if someone at LJ looks it up, they'll see that my email and the email of registered user rhkramer are the same.

Do you really mean temporal variable or do you just mean a temporary variable?

I had never heard of a temporal variable before I read this article, then I did some googling to find it.

In looking at a page of 10 google hits, I then investigated 3 or 4 of those. At least one of them definitely simply meant temporary variable, and, at the time I looked at the article, it used the phrase temporary variable. I'm guessing that at the time google indexed the article it might have said temporal variable--but there were no remaining instances of temporal in the article. OTOH, maybe google decided that I meant temporary and used temporary instead of temporal in the query.

One hit on the page of hits did give me some hints as to what might be meant by a temporal variable:

'
On the semantics of (Bi)temporal variable databases - Springer
link.springer.com/chapter/10.1007%2F3-540-57818-8_53
Numerous proposals for extending the relational data model to incorporate the
temporal dimension of data have appeared during the past several years.
'

I guess my point is (especially as an old guy trying to keep up with some of this stuff), is that it sure would help if terminology didn't change unnecessarily. If the variable in this article truly is something more or different than a temporary variable, fine (but then please provide a definition or a pointer to a definition), but, if it is no different, then just please use "temporary variable".


This is untrue (and is explicitly contradicted by the Intel manuals, which state categorically that “mfence does not serialize the instruction stream”; i.e. the instructions can still execute out of order).

The mfence will cause memory accesses before the fence to complete before memory accesses after the fence (more accurately, it causes memory accesses before the fence to become globally visible — i.e. their effects are apparent to other cores in the system — before those after the fence).

This was just a simplification for a gentle introduction to memory ordering and to when and why barriers are used. Unfortunately, the article has a limited size, so there is no opportunity to carefully and fully describe this and some other interesting points.


If effects of the instruction A and B are to be visible outside the processor core they must somehow access the memory (or to be precise maybe the cache). The article explains inter core or inter processors relations, so IMHO the explanation in the article is a little simplification but it is not inaccurate.