Dear all,
I would like to understand if there is a standard approach to diagnosing why, after booking many operations on an RDataFrame (several Take, Filter, Define, Histo1D, Histo2D, for O(4500) results in total) and mapping the results into nested std::map containers of RResultPtr<double>, RResultPtr<bool>, RResultPtr<TH1D> or RNode, I see crashes of the code with messages like the one posted at the end of the thread.
I tried reducing the complexity of the task (booking 100 operations, for example) and it worked; I also tried changing from EnableImplicitMT(10) to EnableImplicitMT(4) on lxplus for the complex case, and I still get the error.
The segfault happens inside the event loop, when I call `*df.Count()`, and I wonder if there are any tools or switches that could help me understand what is going wrong.

I want to investigate which booked operation is the source of the problem, or whether, for example, calling many Takes blows up the memory and triggers the segfault.
Any suggestion on how to debug such a problem would be helpful.

I added a progress bar and the code breaks before it starts, so I think the crash happens during the compilation of the expressions I am passing around.
I don’t know how to tell what is going on. I tried reducing the number of histograms the code has to produce from O(4000) to O(100) and it works. I don’t know if I am hitting a limit case where RDataFrame just crashes because I use too much memory.

Hi,
Happy to hear you solved the problem.
When no jitting is involved, you can debug an RDF program just like any other C++ program. Just-in-time compiled expressions, however, are not accessible to the debugger, so more rudimentary “black box” approaches like helper printouts are often needed. What I typically do is try to find a minimal reproducer and then remove all jitting so I can simply use a debugger.
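To make the distinction concrete, here is a minimal sketch of the same Filter written as a jitted string and as compiled C++ (the tree name "Events" and the branches "pt" and "eta" are made-up examples; adapt them to your data):

```cpp
#include <ROOT/RDataFrame.hxx>
#include <cmath>
#include <iostream>

int main() {
   ROOT::RDataFrame df("Events", "input.root"); // hypothetical input file

   // Jitted: the string is compiled at runtime by cling;
   // a debugger cannot step into it or show you its variables.
   auto jitted = df.Filter("pt > 30 && std::abs(eta) < 2.4");

   // Compiled: the lambda is ordinary C++ built with your program,
   // so you can set breakpoints inside it and inspect its arguments.
   auto compiled = df.Filter(
      [](float pt, float eta) { return pt > 30.f && std::abs(eta) < 2.4f; },
      {"pt", "eta"});

   std::cout << *compiled.Count() << '\n';
   return 0;
}
```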

If a given selection uses, say, >40 observables, how can one remove the jitting and run it in compiled mode?

That would need to become a function with 40 parameters.

The only feature I have in mind that might help in these situations is a GetFilterCode or GetDefineCode method that takes the expression as string and prints out the corresponding C++ code that would be just-in-time compiled. It might come at some point this year. We are open to suggestions though!

@bdrum, yes, to debug one can minimize the number of branches used. But what if one has O(100) branches used in the selection?
Writing a lambda with 100 arguments seems like overkill, and I guess this is a typical use case of RDataFrame. I agree that to minimize the problem one can rewrite the selection to be much simpler, but in my analysis framework the selection is “generated” by some code, so I would need to modify my analysis selection to effectively debug it.

Just a thought: it might be quite complex to implement, but in principle a lengthy expression almost always contains parentheses and the operators && and ||.
I don’t know how helpful it would be, but already decoupling a selection that depends on 100 branches, with O(30) operators, into small pieces could help automate this and avoid just-in-time compilation: just split the selection into bits, defining intermediate columns that are cut on later. I don’t know if that’s a silly idea.
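For what it’s worth, that decomposition can already be written by hand today: each piece becomes a small compiled lambda, so every intermediate boolean is visible to the debugger. A hedged sketch with invented column names ("pt", "eta", "isTrigMatched"):

```cpp
#include <ROOT/RDataFrame.hxx>
#include <cmath>

void sketch() {
   ROOT::RDataFrame df("Events", "input.root"); // hypothetical input

   // Instead of one long jitted string such as
   //   df.Filter("(pt > 30 && std::abs(eta) < 2.4) || isTrigMatched")
   // book small compiled pieces as intermediate columns:
   auto df2 =
      df.Define("passKin",
                [](float pt, float eta) { return pt > 30.f && std::abs(eta) < 2.4f; },
                {"pt", "eta"})
        .Define("passSel",
                [](bool kin, bool trig) { return kin || trig; },
                {"passKin", "isTrigMatched"});

   // Cut on the combined column later:
   auto count = df2.Filter([](bool p) { return p; }, {"passSel"}).Count();
}
```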

The problem with that is that users can write any valid C++ in filter/define expression strings, and sometimes they do more complex things than concatenating a bunch of simple operations. So in general we would need to keep the just-in-time compilation mechanism (and as soon as strings are just-in-time compiled, debuggers are blind). We could add an additional mechanism to manually parse the simpler expressions, break them down into simple operations and call the right functions by hand, but that would probably result in extremely inefficient event loops.

If your framework is generating code anyway, you could generate full C++ functions instead of just the selection expression and then pass that (plus a vector of 100 branch names, also dynamically generated by your framework) to RDF.

Note that you are already making an advanced usage of RDF, so I am proposing to get even slightly more advanced in order to reap large benefits: debuggers will work, and compiling with optimizations will get you a very large speed-up (I have seen up to 4x), because jitted code is not compiler-optimized.

In CMake it’s enough to specify -DCMAKE_BUILD_TYPE=Release, which I think translates to -O3 (-O2 would be sufficient too). Note that the 4x is for RDF code that spends most of its time within just-in-time compiled code vs RDF code that spends most of its time within code compiled with optimizations. See e.g. here. As per the RDF docs, expression strings are handy but they come with a performance penalty.
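Concretely, the configure step could look like this (project and build directory paths are illustrative):

```shell
# Configure an optimized build; Release implies -O3 with most compilers
cmake -DCMAKE_BUILD_TYPE=Release -S ./myanalysis -B ./build
cmake --build ./build -j 4
```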