libmesh-users

Did you ever try my barrier?
On Apr 19, 2013, at 11:48 AM, Roy Stogner <roystgnr@...> wrote:
>
> On Fri, 19 Apr 2013, Derek Gaston wrote:
>
>> Just to put an end-cap on this.... I switched over to the newest
>> mvapich (1.9b) and all of this stuff cleared up. It's still not
>> clear to me what the issue is/was.... but it's working ;-)
>
> That's a relief! Thanks for keeping us updated!
>
> It'd be nice to know what caused the issue, but probably not worth the
> massive effort it'd take to debug.
>
> I generally like how we add workarounds for compile-time problems with
> old PETSc versions and compiler versions, because that's generally
> easy enough to do... but for run-time problems with old MPI versions,
> forget about it. I've also got a "breaks-with-old-mpich2,
> works-with-new" bug on my plate here that I'm never planning to really
> investigate.
> ---
> Roy
> Libmesh-devel mailing list
> Libmesh-devel@...
> https://lists.sourceforge.net/lists/listinfo/libmesh-devel
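The compile-time workaround pattern Roy describes above usually amounts to a version guard around the code that differs between library releases. Here is a minimal sketch of that idea. The `DEMO_LIB_*` macros and `solver_setup_path()` are hypothetical stand-ins invented for illustration, not names from PETSc or libMesh; a real build would take the version numbers from the library's own headers.

```cpp
#include <cassert>
#include <string>

// Hypothetical version macros. In a real build these would come from the
// library's own version header rather than being defined here.
#define DEMO_LIB_VERSION_MAJOR 3
#define DEMO_LIB_VERSION_MINOR 1

// True when the library being compiled against is older than major.minor.
#define DEMO_LIB_VERSION_LESS_THAN(major, minor)          \
  ((DEMO_LIB_VERSION_MAJOR < (major)) ||                  \
   (DEMO_LIB_VERSION_MAJOR == (major) &&                  \
    DEMO_LIB_VERSION_MINOR < (minor)))

// Hypothetical function whose body depends on the library version.
// Only one branch is ever compiled, so the workaround has no run-time cost.
std::string solver_setup_path()
{
#if DEMO_LIB_VERSION_LESS_THAN(3, 2)
  // Older releases: take the workaround path.
  return "legacy";
#else
  // Newer releases: use the current interface directly.
  return "current";
#endif
}
```

The guard works because the version is known when the code is compiled. A run-time failure that depends on which MPI build is loaded offers no equivalent hook, which is the contrast drawn in the quoted message.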

On Fri, 19 Apr 2013, Derek Gaston wrote:
> Well - it's making it past that point now... well past it. This is
> something different... and not very repeatable...
I was just about to lament that this kind of "stray pointer" error is
the worst that C/C++ have to offer, and is the same kind of error that
led me to eventually completely give up on my first big project when I
was learning C in high school.
But no, you don't just have an old-fashioned stray pointer error, you
have a new-fangled nondeterministic stray pointer error thanks to a
parallel race condition.
Technology is advancing! :-P
---
Roy

We're looking into the possibility that it's a bad node in our cluster.
Here's another core dump from a different place:
Program terminated with signal 7, Bus error.
#0  libMesh::Quad::n_vertices (this=0x2aabc0c0bcc0) at ./include/libmesh/face_quad.h:80
80        unsigned int n_vertices() const { return 4; }
(gdb) where
#0  libMesh::Quad::n_vertices (this=0x2aabc0c0bcc0) at ./include/libmesh/face_quad.h:80
#1  0x00002ab19fe62334 in libMesh::Elem::hmax (this=0x2aabc0c0bcc0) at src/geom/elem.C:367
#2  0x00002ab19fc4ca5b in libMesh::KellyErrorEstimator::internal_side_integration (this=0x2aabc04eb160) at src/error_estimation/kelly_error_estimator.C:93
#3  0x00002ab19fc49730 in libMesh::JumpErrorEstimator::estimate_error (this=0x2aabc04eb160, system=..., error_per_cell=..., solution_vector=0x0, estimate_parent_error=false) at src/error_
Notice that it's always a signal 7, Bus error..... how could "return 4"
ever not work?
Derek
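Since every crash reported here is a signal 7 (SIGBUS on Linux), one way to get more detail is to record the `si_code` and faulting address that the kernel delivers with the signal. The following is only a sketch using the POSIX `sigaction` interface. The names `describe_bus_code`, `format_hex`, `on_sigbus`, and `install_sigbus_reporter` are hypothetical and are not part of libMesh. It is not claimed that this would have identified the cause in this thread.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <signal.h>
#include <unistd.h>

// Map the si_code delivered with SIGBUS to a short description.
const char* describe_bus_code(int code)
{
  switch (code)
  {
    case BUS_ADRALN: return "invalid address alignment";
    case BUS_ADRERR: return "nonexistent physical address";
    case BUS_OBJERR: return "object-specific hardware error";
    default:         return "unrecognized si_code";
  }
}

// Write value as fixed-width hex ("0x" plus two digits per byte) into buf.
// Avoids printf-family calls so it can be used inside a signal handler.
// Returns the number of characters written, or 0 if buf is too small.
std::size_t format_hex(std::uintptr_t value, char* buf, std::size_t size)
{
  const char digits[] = "0123456789abcdef";
  const std::size_t ndigits = 2 * sizeof(value);
  if (size < ndigits + 2)
    return 0;
  buf[0] = '0';
  buf[1] = 'x';
  for (std::size_t i = 0; i < ndigits; ++i)
  {
    const unsigned shift = static_cast<unsigned>(4 * (ndigits - 1 - i));
    buf[2 + i] = digits[(value >> shift) & 0xf];
  }
  return ndigits + 2;
}

// Small wrapper around write(); the result is deliberately ignored because
// there is nothing useful to do about a failed write while crashing.
static void emit(const char* text, std::size_t len)
{
  const ssize_t ignored = write(STDERR_FILENO, text, len);
  (void)ignored;
}

// Handler: report why the bus error was raised and where, then exit.
extern "C" void on_sigbus(int, siginfo_t* info, void*)
{
  emit("SIGBUS: ", 8);
  const char* why = describe_bus_code(info->si_code);
  emit(why, std::strlen(why));
  emit(" at ", 4);
  char addr[2 + 2 * sizeof(std::uintptr_t)];
  const std::size_t n =
    format_hex(reinterpret_cast<std::uintptr_t>(info->si_addr), addr, sizeof(addr));
  emit(addr, n);
  emit("\n", 1);
  _exit(128 + SIGBUS);
}

// Install the handler; returns true on success.
bool install_sigbus_reporter()
{
  struct sigaction sa;
  std::memset(&sa, 0, sizeof(sa));
  sa.sa_sigaction = on_sigbus;
  sa.sa_flags = SA_SIGINFO;
  sigemptyset(&sa.sa_mask);
  return sigaction(SIGBUS, &sa, nullptr) == 0;
}
```

Calling `install_sigbus_reporter()` once near the start of `main` would be enough to try this. Note that exiting from the handler suppresses the core dump, so this is a trade-off against the gdb workflow shown above.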
On Fri, Apr 19, 2013 at 11:13 AM, Roy Stogner <roystgnr@...> wrote:
>
> On Fri, 19 Apr 2013, Derek Gaston wrote:
>
>> Well - it's making it past that point now... well past it. This is
>> something different... and not very repeatable...
>>
>
> I was just about to lament that this kind of "stray pointer" error is
> the worst that C/C++ have to offer, and is the same kind of error that
> led me to eventually completely give up on my first big project when I
> was learning C in high school.
>
> But no, you don't just have an old-fashioned stray pointer error, you
> have a new-fangled nondeterministic stray pointer error thanks to a
> parallel race condition.
>
> Technology is advancing! :-P
> ---
> Roy
>