Forumshttps://software.intel.com/en-us/view/forum-page-default/36940
enIntel(R) TBB 2018 released!https://software.intel.com/en-us/forums/intel-threading-building-blocks/topic/684925
<div class="field field-name-body field-type-text-with-summary field-label-hidden"><div class="field-items"><div class="field-item even" property="content:encoded"><ul>
<li><a href="https://software.intel.com/en-us/articles/intel-threading-building-blocks-release-notes#2018">What's new</a> in Intel® TBB 2018 and release notes.</li>
<li>Intel® TBB commercial version: <a href="https://software.intel.com/en-us/intel-tbb">https://software.intel.com/en-us/intel-tbb</a> </li>
<li>Intel® TBB open source version: <a href="https://www.threadingbuildingblocks.org/" rel="nofollow">https://www.threadingbuildingblocks.org/</a></li>
<li>Updated and improved Intel® TBB 2018 <a href="https://software.intel.com/en-us/tbb-tutorial">Tutorial</a> and <a href="https://software.intel.com/en-us/tbb-documentation">General documentation.</a></li>
</ul>
</div></div></div>Thu, 08 Sep 2016 12:51:14 +0000Alexey M. (Intel)684925 at https://software.intel.comSignal processing filtershttps://software.intel.com/en-us/forums/intel-threading-building-blocks/topic/755345
<div class="field field-name-body field-type-text-with-summary field-label-hidden"><div class="field-items"><div class="field-item even" property="content:encoded"><p>Hi,</p>
<p>I need to write a signal processing software with TBB. I am trying to use tbb::flow for that. I have:</p>
<p>- One producer P0</p>
<p>- Two consumers C0, C1</p>
<p>In order to communicate, the producer writes chunks of data to a std::vector&lt;float&gt; called pool. The size of this vector is nb_chunks * chunk_size. To get an idea, chunk_size is about 100 and nb_chunks is about 10.</p>
<p>When the producer first receives a signal from the acquisition device, it writes 100 elements to the pool and sends a signal to the two consumers C0 and C1 so that they can read the data. The producer might then receive another signal from the acquisition device, write 100 elements to the next memory chunk, and send another signal to C0 and C1. When they are done with the data, C0 and C1 need to send a signal back to P0 telling it that the chunk is available to be written again.</p>
<p>As a consequence, P0 is connected to C0 and C0 is connected to P0. When P0 receives the message from C0 that a buffer chunk has been released, it usually won't fill it right away and ship it back to the consumers, so I don't want it to send a signal to the consumers immediately. It seems to me that this does not fit the way TBB is designed.</p>
<p>So I don't know how to design this workflow. Any help would be appreciated.</p>
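<p>One way to picture the bookkeeping described above, independent of tbb::flow, is a small chunk tracker: the producer takes a free chunk, and the chunk becomes free again only after every consumer has released it. The following is a minimal sketch under assumptions, not a TBB API: the class name ChunkPool and its member functions are hypothetical, and the class is not thread-safe, so calls to it would have to be serialized (for example by a mutex or a single serial stage).</p>

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not a TBB type): tracks which chunks of the shared
// pool are free and how many consumers still hold each published chunk.
// Not thread-safe: callers must serialize access.
class ChunkPool {
public:
    ChunkPool(std::size_t nb_chunks, std::size_t chunk_size, int nb_consumers)
        : data_(nb_chunks * chunk_size),
          pending_(nb_chunks, 0),
          chunk_size_(chunk_size),
          nb_consumers_(nb_consumers) {
        assert(nb_consumers > 0);
    }

    // Producer side: returns the index of a free chunk, or -1 if every chunk
    // is still held. The returned chunk is marked as held by all consumers.
    int acquire() {
        for (std::size_t i = 0; i < pending_.size(); ++i) {
            if (pending_[i] == 0) {
                pending_[i] = nb_consumers_;
                return static_cast<int>(i);
            }
        }
        return -1;
    }

    // Consumer side: one consumer is done with the chunk. Returns true when
    // this was the last consumer, i.e. the chunk is free for the producer.
    bool release(int chunk) {
        assert(pending_[static_cast<std::size_t>(chunk)] > 0);
        return --pending_[static_cast<std::size_t>(chunk)] == 0;
    }

    // Start of the storage that belongs to the given chunk.
    float* chunk_data(int chunk) {
        return data_.data() + static_cast<std::size_t>(chunk) * chunk_size_;
    }

private:
    std::vector<float> data_;   // nb_chunks * chunk_size floats
    std::vector<int> pending_;  // consumers still holding each chunk
    std::size_t chunk_size_;
    int nb_consumers_;
};
```

<p>With this shape the producer never needs a message back from the consumers before it acts: it simply asks for a free chunk when the acquisition device fires, and a release by C0 or C1 only updates the counter.</p>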
</div></div></div>Fri, 02 Feb 2018 11:34:17 +0000velvia755345 at https://software.intel.comPerformance bottlenecks on high end server https://software.intel.com/en-us/forums/intel-threading-building-blocks/topic/755336
<div class="field field-name-body field-type-text-with-summary field-label-hidden"><div class="field-items"><div class="field-item even" property="content:encoded"><p>I am using Intel TBB threads and Intel parallelism in my application for performance optimization. This works well on my development machine, with an average CPU load of about 10%. But when I deploy the application on the target machine, a high-end server with 32 cores and a large amount of RAM, it uses 80-90% of the CPU in the same scenario. The server is custom-built hardware running RHEL 6.<br />
What could be the root cause of this problem?</p>
</div></div></div>Fri, 02 Feb 2018 04:31:29 +0000SATI, RAJAT755336 at https://software.intel.comUpdate on processor vulnerabilities (Spectre/Meltdown)https://software.intel.com/en-us/forums/intel-threading-building-blocks/topic/755300
<div class="field field-name-body field-type-text-with-summary field-label-hidden"><div class="field-items"><div class="field-item even" property="content:encoded"><p>Are there potential vulnerabilities exposed in the Intel TBB library with regard to the Spectre &amp; Meltdown processor vulnerabilities?</p>
<p>If yes, is there an update scheduled in response to these vulnerabilities?</p>
<p>Thanks</p>
</div></div></div>Thu, 01 Feb 2018 06:46:53 +0000Place, David755300 at https://software.intel.comBlocking (waiting) version of try_put()?https://software.intel.com/en-us/forums/intel-threading-building-blocks/topic/755268
<div class="field field-name-body field-type-text-with-summary field-label-hidden"><div class="field-items"><div class="field-item even" property="content:encoded"><p>Hello, and thank you all for TBB; it is a wonderful library. I am already using "parallel_do" to great effect in one application.</p>
<p>Now I am considering how to map a different application onto the TBB model, and flow graphs appear to be very close to what I want.</p>
<p>However, my "messages" are large, so I need to limit how many of them are in flight at a time. I have read the "Flow Graph Tips for Limiting Resource Consumption" chapter -- in particular, the "Create a Token-Based System" section -- but the pull model it requires is very unnatural in my case.</p>
<p>The problem is that my message source really wants a push model. The actual generation of messages is deep inside a nested set of member function calls on various objects. Capturing all of that state in some kind of closure/continuation so that I can spit out the next message on-demand would require a major restructuring (and, in my opinion, uglification) of the code.</p>
<p>Is there a variant of try_put() that simply waits until there is room for the next message? Of course I do not want the underlying thread to block; I want it to go off processing other tasks like wait_for_all() etc.</p>
<p>Or is there some straightforward way to achieve a similar effect (e.g. a simple semaphore primitive that knows how to process other tasks while waiting)?</p>
<p>Thanks!</p>
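<p>The "semaphore that keeps the caller useful while waiting" idea from the question can be sketched in plain C++ as follows. This is only an illustration under assumptions: InFlightLimiter is a hypothetical name, not a TBB class, and the do_other_work callback is a placeholder for whatever the caller can usefully run between attempts. Whether TBB offers a suitable call to put there is not claimed here.</p>

```cpp
#include <atomic>
#include <cassert>

// Hypothetical helper (not part of TBB): caps the number of messages that
// may be in flight at once.
class InFlightLimiter {
public:
    explicit InFlightLimiter(int max_in_flight) : free_(max_in_flight) {
        assert(max_in_flight > 0);
    }

    // Takes one slot if any is free. Never blocks.
    bool try_acquire() {
        int n = free_.load(std::memory_order_relaxed);
        while (n > 0) {
            // On failure, n is reloaded with the current value and we retry.
            if (free_.compare_exchange_weak(n, n - 1, std::memory_order_acquire))
                return true;
        }
        return false;
    }

    // Waits for a slot, calling do_other_work() between attempts so the
    // calling thread is not idle while it waits.
    template <typename F>
    void acquire(F&& do_other_work) {
        while (!try_acquire()) do_other_work();
    }

    // Called when a message has been fully processed.
    void release() { free_.fetch_add(1, std::memory_order_release); }

private:
    std::atomic<int> free_;  // number of free slots
};
```

<p>The message source would call acquire() just before pushing a message, and the last stage that handles the message would call release(). The push-style call structure of the source stays as it is.</p>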
</div></div></div>Wed, 31 Jan 2018 17:38:08 +0000Patrick Lopresti755268 at https://software.intel.comCompilation of atomic reads into 3 identical loadshttps://software.intel.com/en-us/forums/intel-threading-building-blocks/topic/755254
<div class="field field-name-body field-type-text-with-summary field-label-hidden"><div class="field-items"><div class="field-item even" property="content:encoded"><p>Hello,</p>
<p>This is more out of curiosity than anything else. When looking at the generated assembly code for a tight loop that polls an atomic state member until it has a certain value, I see that the read of the atomic variable is translated by the compiler (gcc 5.2.0 x64) into 3 identical loads (as shown by the assembly view in VTune). So:</p>
<p>while (m_state == TS_BUSY_WAITING) { ASM_PAUSE; } </p>
<p>turns into</p>
<p>Block 7: <br />
pause <br />
movl 0x3c(%rdi), %eax <br />
movl 0x3c(%rdi), %eax <br />
movl 0x3c(%rdi), %eax <br />
cmp $0x3, %eax <br />
jz 0x1b5af88 &lt;Block 7&gt; </p>
<p>Notice the 3 identical movl operations.</p>
<p> </p>
<p>What is the cause behind this translation? I see a similar translation also in other places where tbb::atomic is being used.</p>
</div></div></div>Wed, 31 Jan 2018 08:52:57 +0000Stephan T.755254 at https://software.intel.comDetermine number of workers in currently active task schedulerhttps://software.intel.com/en-us/forums/intel-threading-building-blocks/topic/755238
<div class="field field-name-body field-type-text-with-summary field-label-hidden"><div class="field-items"><div class="field-item even" property="content:encoded"><p>I'm having trouble finding a way in TBB to get the current number of workers in the active task scheduler. For a contrived example:</p>
<blockquote><p>void g() {</p>
<p> // how can I query from TBB here to get the maximum allowed number of workers that was set in f()?</p>
<p>}</p>
<p>void f(int num_workers) {</p>
<p> tbb::task_scheduler_init init(num_workers);</p>
<p> g();</p>
<p>}</p>
</blockquote>
</div></div></div>Tue, 30 Jan 2018 21:22:50 +0000e4lam755238 at https://software.intel.comparallel_invokehttps://software.intel.com/en-us/forums/intel-threading-building-blocks/topic/755151
<div class="field field-name-body field-type-text-with-summary field-label-hidden"><div class="field-items"><div class="field-item even" property="content:encoded"><p>Hi,</p>
<p>I have tried to use tbb::parallel_invoke to parallelize two functions. I have the following functions:</p>
<pre class="brush:cpp;">bool isValidDate(string order)
int checkOrder(string order)</pre><p>isValidDate checks that the order date is correct, and checkOrder checks the format of the order.</p>
<p>Currently, I execute these functions sequentially: first I check the date, and then the format. Now I want to execute both functions in parallel, so I use parallel_invoke as follows:</p>
<pre class="brush:cpp;">tbb::parallel_invoke (
[&amp;]{isValidDate(order);}, [&amp;]{checkOrder(order);} );</pre><p>This code executes correctly, but it is slower than the sequential code. I don't understand why.</p>
<p>Could anybody tell me the reason, or a better way to parallelize these functions?</p>
<p>Thank you so much!</p>
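<p>For comparison, here is a minimal sketch of running the two checks concurrently while keeping their return values, which the lambdas above discard. It uses std::async from the standard library instead of tbb::parallel_invoke so that it is self-contained. The bodies of isValidDate and checkOrder are hypothetical stand-ins, the const-reference parameters are an assumption (the post passes the string by value, which copies it on every call), and OrderCheck and checkBoth are names made up for this example.</p>

```cpp
#include <cassert>
#include <future>
#include <string>

// Hypothetical stand-in: accepts orders that start with a YYYY-MM-DD date.
bool isValidDate(const std::string& order) {
    return order.size() >= 10 && order[4] == '-' && order[7] == '-';
}

// Hypothetical stand-in: 0 means the format is fine, -1 means empty order.
int checkOrder(const std::string& order) {
    return order.empty() ? -1 : 0;
}

struct OrderCheck {
    bool dateOk;
    int formatCode;
};

// Runs the date check on another thread and the format check on the calling
// thread, then combines both results.
OrderCheck checkBoth(const std::string& order) {
    assert(order.size() < order.max_size());
    std::future<bool> date =
        std::async(std::launch::async, [&order] { return isValidDate(order); });
    int format = checkOrder(order);
    return OrderCheck{date.get(), format};
}
```

<p>When each check takes only a few microseconds, the cost of handing work to another thread can exceed the work itself, which would be consistent with the slowdown described above. Parallelizing over many orders at once, rather than over the two checks of a single order, usually gives each task more work to amortize that cost.</p>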
</div></div></div>Fri, 26 Jan 2018 23:37:35 +0000pascual, carlos755151 at https://software.intel.comparallelize vector array with intelTBBhttps://software.intel.com/en-us/forums/intel-threading-building-blocks/topic/754920
<div class="field field-name-body field-type-text-with-summary field-label-hidden"><div class="field-items"><div class="field-item even" property="content:encoded"><p>Hi, I'm new to Intel TBB and I am trying to parallelize the processing of an array of vectors. This is the function, to which I pass the array as a pointer.</p>
<pre class="brush:cpp;">void calculate(map&lt;int,string&gt; &amp;m, vector&lt;order&gt;* ord, auto&amp;k, string &amp;dir){
for(int i=0;i&lt;k;i++){
//here I declare a series of variables
for(int j=0; j&lt;ord[i][j].size()-1; j++){
// here I do a series of calculations with the values of
// the array and I generate one file for each row of the array
// with this calculations, to do this I declare more variables
}
}
}</pre><p>I want to parallelize this function so that it can process several rows of the array at the same time. I have tried to do this with parallel_for as follows:</p>
<pre class="brush:cpp;">void calculate(map&lt;int,string&gt; &amp;m, vector&lt;order&gt;* ord, auto&amp;k, string &amp;dir){
tbb::parallel_for(tbb::blocked_range&lt;int&gt;(0,k),[&amp;](tbb::blocked_range&lt;int&gt; r)
{
for(int i=r.begin(); i&lt;r.end(); i++){
//here I declare a series of variables
for(int j=0; j&lt;ord[i][j].size()-1; j++){
// here I do a series of calculations with the values of
// the array and I generate one file for each row of the array
// with this calculations, to do this I declare more variables
}
}
});
}</pre><p>This program compiles, but when it is executed I sometimes get a result (the text files that I need) with wrong calculations, and more often I get an error like this:</p>
<pre class="brush:cpp;">TBB Warning : Exact exception propagation is requested by application but the linked library
is built without support for it
terminate called after throwing an instance of 'tbb::capture_exception'</pre><p>I don't know whether this is the correct way to do it. Can anybody help me?</p>
<p>Thank you!</p>
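<p>The error message above suggests that an exception escaped the loop body, so it may help to check the inner loop bound first. The sketch below rests on two assumptions: that the intended bound is the length of row i (ord[i].size()) rather than ord[i][j].size(), and that the per-row work can be written against local variables only. Order, processRow and processAll are hypothetical names, and the arithmetic is a placeholder for the real calculations.</p>

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the `order` type used in the post.
struct Order {
    double amount;
};

// Processes one row using only local state, so separate rows never touch
// the same variables.
double processRow(const std::vector<Order>& row) {
    double total = 0.0;  // local to this call, never shared between rows
    // size() is unsigned: `size() - 1` on an empty row wraps around to a huge
    // value, so the bound is written as `j + 1 < size()` instead.
    for (std::size_t j = 0; j + 1 < row.size(); ++j) {
        total += row[j + 1].amount - row[j].amount;
    }
    return total;
}

// Serial driver. Each iteration writes only results[i], which is the
// property a parallel loop over rows would rely on.
std::vector<double> processAll(const std::vector<std::vector<Order>>& rows) {
    std::vector<double> results(rows.size());
    for (std::size_t i = 0; i < rows.size(); ++i) {
        results[i] = processRow(rows[i]);
    }
    assert(results.size() == rows.size());
    return results;
}
```

<p>Once the serial version gives correct output for every row, including empty ones, the outer loop is the part that could be handed to a parallel loop, provided each row also writes to its own output file.</p>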
</div></div></div>Wed, 24 Jan 2018 20:53:14 +0000pascual, carlos754920 at https://software.intel.comIntel TBB segfaults after updating Ubuntu kernel to 4.13https://software.intel.com/en-us/forums/intel-threading-building-blocks/topic/754805
<div class="field field-name-body field-type-text-with-summary field-label-hidden"><div class="field-items"><div class="field-item even" property="content:encoded"><p>TBB version: 2018 initial release</p>
<p>I am using the OpenCV library compiled with the TBB threading framework through the Python interface. After updating the Kubuntu 16.04 kernel from version 4.10 to 4.13, my code stopped working, with a segfault right at the "import cv2" line. I tried recompiling OpenCV with TBB under the new kernel and ended up with the same problem. After recompiling with OpenMP instead of TBB, the problem disappears. It seems that 2018 Update 2 has the same problem.</p>
<p>Interestingly, when I simply start the Python interpreter, run "import cv2", and perform the same operations as in the code interactively, there are no problems.</p>
<p>Backtrace:</p>
<pre class="brush:plain;">[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Program received signal SIGSEGV, Segmentation fault.
0x00007ffff4155659 in std::rethrow_exception(std::__exception_ptr::exception_ptr) () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
(gdb) bt
#0 0x00007ffff4155659 in std::rethrow_exception(std::__exception_ptr::exception_ptr) () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#1 0x00007fff9ea3cc38 in tbb::internal::gcc_rethrow_exception_broken () at ../../src/tbb/tbb_misc.cpp:179
#2 0x00007fff9ea3f5cb in tbb::internal::governor::acquire_resources () at ../../src/tbb/governor.cpp:80
#3 0x00007fff9ea4bcc0 in tbb::internal::__TBB_InitOnce::__TBB_InitOnce (this=&lt;optimized out&gt;) at ../../src/tbb/tbb_main.h:71
#4 __sti___ZN48_INTERNAL_26_______src_tbb_tbb_main_cpp_ca9dcbe33tbb8internal28__TBB_InitOnceHiddenInstanceE () at ../../src/tbb/tbb_main.cpp:71
#5 __sti__$E () at ../../src/tbb/tbb_main.cpp:52
#6 0x00007ffff7de76ba in call_init (l=&lt;optimized out&gt;, argc=argc@entry=11, argv=argv@entry=0x7fffffffdb08, env=env@entry=0xf2b750) at dl-init.c:72
#7 0x00007ffff7de77cb in call_init (env=0xf2b750, argv=0x7fffffffdb08, argc=11, l=&lt;optimized out&gt;) at dl-init.c:30
#8 _dl_init (main_map=main_map@entry=0x1206550, argc=11, argv=0x7fffffffdb08, env=0xf2b750) at dl-init.c:120
#9 0x00007ffff7dec8e2 in dl_open_worker (a=a@entry=0x7fffffffa810) at dl-open.c:575
#10 0x00007ffff7de7564 in _dl_catch_error (objname=objname@entry=0x7fffffffa800, errstring=errstring@entry=0x7fffffffa808, mallocedp=mallocedp@entry=0x7fffffffa7ff, operate=operate@entry=0x7ffff7dec4d0 &lt;dl_open_worker&gt;,
args=args@entry=0x7fffffffa810) at dl-error.c:187
#11 0x00007ffff7debda9 in _dl_open (file=0x7fffa27178a0 "/usr/local/lib/python3.5/dist-packages/cv2.cpython-35m-x86_64-linux-gnu.so", mode=-2147483646, caller_dlopen=0x60b35a &lt;_PyImport_FindSharedFuncptr+138&gt;, nsid=-2,
argc=&lt;optimized out&gt;, argv=&lt;optimized out&gt;, env=0xf2b750) at dl-open.c:660
#12 0x00007ffff75ecf09 in dlopen_doit (a=a@entry=0x7fffffffaa40) at dlopen.c:66
#13 0x00007ffff7de7564 in _dl_catch_error (objname=0xbef2d0, errstring=0xbef2d8, mallocedp=0xbef2c8, operate=0x7ffff75eceb0 &lt;dlopen_doit&gt;, args=0x7fffffffaa40) at dl-error.c:187
#14 0x00007ffff75ed571 in _dlerror_run (operate=operate@entry=0x7ffff75eceb0 &lt;dlopen_doit&gt;, args=args@entry=0x7fffffffaa40) at dlerror.c:163
#15 0x00007ffff75ecfa1 in __dlopen (file=&lt;optimized out&gt;, mode=&lt;optimized out&gt;) at dlopen.c:87
#16 0x000000000060b35a in _PyImport_FindSharedFuncptr ()
#17 0x000000000061000b in _PyImport_LoadDynamicModuleWithSpec ()
#18 0x0000000000610538 in ?? ()
#19 0x00000000004e9c36 in PyCFunction_Call ()
#20 0x000000000053dbbb in PyEval_EvalFrameEx ()
#21 0x0000000000540199 in ?? ()
#22 0x000000000053c1d0 in PyEval_EvalFrameEx ()
#23 0x000000000053b7e4 in PyEval_EvalFrameEx ()
#24 0x000000000053b7e4 in PyEval_EvalFrameEx ()
#25 0x000000000053b7e4 in PyEval_EvalFrameEx ()
#26 0x000000000053b7e4 in PyEval_EvalFrameEx ()
#27 0x0000000000540f9b in PyEval_EvalCodeEx ()
#28 0x00000000004ebd23 in ?? ()
#29 0x00000000005c1797 in PyObject_Call ()
#30 0x00000000005c257a in _PyObject_CallMethodIdObjArgs ()
#31 0x00000000005260c8 in PyImport_ImportModuleLevelObject ()
#32 0x0000000000549e78 in ?? ()
#33 0x00000000004e9ba7 in PyCFunction_Call ()
#34 0x00000000005c1797 in PyObject_Call ()
#35 0x0000000000534d90 in PyEval_CallObjectWithKeywords ()
#36 0x000000000053a1c7 in PyEval_EvalFrameEx ()
#37 0x0000000000540199 in ?? ()
#38 0x0000000000540e4f in PyEval_EvalCode ()
#39 0x000000000054a6b8 in ?? ()
#40 0x00000000004e9c36 in PyCFunction_Call ()
#41 0x000000000053dbbb in PyEval_EvalFrameEx ()
#42 0x0000000000540199 in ?? ()
#43 0x000000000053c1d0 in PyEval_EvalFrameEx ()
#44 0x000000000053b7e4 in PyEval_EvalFrameEx ()
#45 0x000000000053b7e4 in PyEval_EvalFrameEx ()
#46 0x000000000053b7e4 in PyEval_EvalFrameEx ()
#47 0x0000000000540f9b in PyEval_EvalCodeEx ()
#48 0x00000000004ebd23 in ?? ()
#49 0x00000000005c1797 in PyObject_Call ()
#50 0x00000000005c257a in _PyObject_CallMethodIdObjArgs ()
#51 0x00000000005260c8 in PyImport_ImportModuleLevelObject ()
#52 0x0000000000549e78 in ?? ()
#53 0x00000000004e9ba7 in PyCFunction_Call ()
#54 0x00000000005c1797 in PyObject_Call ()
#55 0x0000000000534d90 in PyEval_CallObjectWithKeywords ()
#56 0x000000000053a1c7 in PyEval_EvalFrameEx ()
#57 0x0000000000540199 in ?? ()
#58 0x0000000000540e4f in PyEval_EvalCode ()
#59 0x000000000054a6b8 in ?? ()
#60 0x00000000004e9c36 in PyCFunction_Call ()
#61 0x000000000053dbbb in PyEval_EvalFrameEx ()
#62 0x0000000000540199 in ?? ()
#63 0x000000000053c1d0 in PyEval_EvalFrameEx ()
#64 0x000000000053b7e4 in PyEval_EvalFrameEx ()
#65 0x000000000053b7e4 in PyEval_EvalFrameEx ()
#66 0x000000000053b7e4 in PyEval_EvalFrameEx ()
#67 0x0000000000540f9b in PyEval_EvalCodeEx ()
#68 0x00000000004ebd23 in ?? ()
#69 0x00000000005c1797 in PyObject_Call ()
#70 0x00000000005c257a in _PyObject_CallMethodIdObjArgs ()
#71 0x00000000005260c8 in PyImport_ImportModuleLevelObject ()
#72 0x0000000000549e78 in ?? ()
#73 0x00000000004e9ba7 in PyCFunction_Call ()
#74 0x00000000005c1797 in PyObject_Call ()
#75 0x0000000000534d90 in PyEval_CallObjectWithKeywords ()
#76 0x000000000053a1c7 in PyEval_EvalFrameEx ()
#77 0x0000000000540199 in ?? ()
#78 0x0000000000540e4f in PyEval_EvalCode ()
#79 0x000000000060c272 in ?? ()
#80 0x000000000060e71a in PyRun_FileExFlags ()
#81 0x000000000060ef0c in PyRun_SimpleFileExFlags ()
#82 0x000000000063fb26 in Py_Main ()
#83 0x00000000004cfeb1 in main ()</pre><p> </p>
</div></div></div>Mon, 22 Jan 2018 16:19:29 +0000mvrht, u3425923754805 at https://software.intel.com