c++ - How does the omp ordered clause work?

Question

Welcome To Ask or Share your Answers For Others

c++ - How does the omp ordered clause work?

posted Oct 17, 2021 in Technique[技术] by 深蓝 (71.8m points)

c++ - How does the omp ordered clause work?

vector<int> v;

#pragma omp parallel for ordered schedule(dynamic, anyChunkSizeGreaterThan1)
    for (int i = 0; i < n; ++i){
            ...
            ...
            ...
#pragma omp ordered
            v.push_back(i);
    }

This fills v with an n sized ordered list.

When reaching the omp ordered block all threads need to wait for the lowest iteration possible thread to finish, but what if none of the threads was appointed that specific iteration? Or does the OpenMP runtime library always make sure that the lowest iteration is handled by some thread?

Also why is it suggested that ordered clause be used along with the dynamic schedule? Would static schedule affect performance?

Question&Answers:os

与恶龙缠斗过久,自身亦成为恶龙；凝视深渊过久,深渊将回以凝视…

1 Reply

深蓝 · Answer 1 · 2021-10-16T22:31:09+0000

The ordered clause works like this: different threads execute concurrently until they encounter the ordered region, which is then executed sequentially in the same order as it would get executed in a serial loop. This still allows for some degree of concurrency, especially if the code section outside the ordered region has substantial run time.

There is no particular reason to use dynamic schedule instead of static schedule with small chunk size. It all depends on the structure of the code. Since ordered introduces dependency between the threads, if used with schedule(static) with default chunk size, the second thread would have to wait for the first one to finish all iterations, then the third thread would have to wait for the second one to finish its iterations (and hence for the first one too), and so on. One could easily visualise it with 3 threads and 9 iterations (3 per thread):

tid  List of     Timeline
     iterations
0    0,1,2       ==o==o==o
1    3,4,5       ==.......o==o==o
2    6,7,8       ==..............o==o==o

= shows that the thread is executing code in parallel. o is when the thread is executing the ordered region. . is the thread being idle, waiting for its turn to execute the ordered region. With schedule(static,1) the following would happen:

tid  List of     Timeline
     iterations
0    0,3,6       ==o==o==o
1    1,4,7       ==.o==o==o
2    2,5,8       ==..o==o==o

I believe the difference in both cases is more than obvious. With schedule(dynamic) the pictures above would become more or less random as the list of iterations assigned to each thread is non-deterministic. It would also add an additional overhead. It is only useful if the amount of computation is different for each iteration and it takes much more time to do the computation than is the added overhead of using dynamic scheduling.

Don't worry about the lowest numbered iteration. It is usually handled to the first thread in the team to become ready to execute code.

Categories

c++ - How does the omp ordered clause work?

c++ - How does the omp ordered clause work?

Please log in or register to add a comment.

Please log in or register to reply this article.

1 Reply

Please log in or register to add a comment.

Just Browsing Browsing

Most popular tags