Segmentation fault: MPI + OCCGeometry from step file

Dear all,

I get a segmentation fault when I combine MPI parallelization with an OCCGeometry loaded from a step file.
The minimal example is based on your template.
poisson_parallel_step.py (777 Bytes)
It runs fine for any number of processors if it uses unit_cube from netgen.occ:

mpiexec -np 4 python poisson_parallel_step.py

However, if it loads the geometry from a step file (Cube.step (10.0 KB)) via

mpiexec -np 4 python poisson_parallel_step.py step

it fails with a segfault for any number of processors > 1:

[EPYC:3092124] *** Process received signal ***
[EPYC:3092124] Signal: Segmentation fault (11)
[EPYC:3092124] Signal code: Address not mapped (1)
[EPYC:3092124] Failing at address: 0x8
[EPYC:3092124] [ 0] /lib/x86_64-linux-gnu/libc.so.6(+0x45320)[0x7ec13ce45320]
[EPYC:3092124] [ 1] /home/user/miniconda3/envs/ngsolve_mpi/lib/python3.12/site-packages/ngsolve/../netgen_mesher.libs/libngcomp.so(+0x72149c)[0x7ec12d52149c]
[EPYC:3092124] [ 2] /home/user/miniconda3/envs/ngsolve_mpi/lib/python3.12/site-packages/pyngcore/../netgen_mesher.libs/libngcore.so(_ZN6ngcore11TaskManager9CreateJobERKSt8functionIFvRNS_8TaskInfoEEEi+0xe4)[0x7ec13876f724]
[EPYC:3092124] [ 3] /home/user/miniconda3/envs/ngsolve_mpi/lib/python3.12/site-packages/ngsolve/../netgen_mesher.libs/libngcomp.so(_ZN6ngcomp15IterateElementsERKNS_7FESpaceEN5ngfem4VorBERN6ngcore9LocalHeapERKSt8functionIFvNS0_7ElementES7_EE+0x337)[0x7ec12d521b37]
[EPYC:3092124] [ 4] /home/user/miniconda3/envs/ngsolve_mpi/lib/python3.12/site-packages/ngsolve/../netgen_mesher.libs/libngcomp.so(_ZN6ngcomp14S_BilinearFormIdE10DoAssembleERN6ngcore9LocalHeapE+0x1533)[0x7ec12d48fbe3]
[EPYC:3092124] [ 5] /home/user/miniconda3/envs/ngsolve_mpi/lib/python3.12/site-packages/ngsolve/../netgen_mesher.libs/libngcomp.so(_ZN6ngcomp12BilinearForm8AssembleERN6ngcore9LocalHeapE+0xb6)[0x7ec12d42bd96]
[EPYC:3092124] [ 6] /home/user/miniconda3/envs/ngsolve_mpi/lib/python3.12/site-packages/ngsolve/../netgen_mesher.libs/libngcomp.so(+0xebe74c)[0x7ec12dcbe74c]
[EPYC:3092124] [ 7] /home/user/miniconda3/envs/ngsolve_mpi/lib/python3.12/site-packages/ngsolve/../netgen_mesher.libs/libngcomp.so(+0x7027c1)[0x7ec12d5027c1]
[EPYC:3092124] [ 8] python[0x54c584]
[EPYC:3092124] [ 9] python(_PyObject_MakeTpCall+0x2fb)[0x51da5b]
[EPYC:3092124] [10] python(_PyEval_EvalFrameDefault+0x6d3)[0x528303]
[EPYC:3092124] [11] python(PyEval_EvalCode+0xae)[0x5e469e]
[EPYC:3092124] [12] python[0x60aae7]
[EPYC:3092124] [13] python[0x605cc7]
[EPYC:3092124] [14] python[0x61e022]
[EPYC:3092124] [15] python(_PyRun_SimpleFileObject+0x1b0)[0x61d960]
[EPYC:3092124] [16] python(_PyRun_AnyFileObject+0x43)[0x61d753]
[EPYC:3092124] [17] python(Py_RunMain+0x303)[0x6167e3]
[EPYC:3092124] [18] python(Py_BytesMain+0x39)[0x5cfa89]
[EPYC:3092124] [19] /lib/x86_64-linux-gnu/libc.so.6(+0x2a1ca)[0x7ec13ce2a1ca]
[EPYC:3092124] [20] /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0x8b)[0x7ec13ce2a28b]
[EPYC:3092124] [21] python[0x5cf8b9]
[EPYC:3092124] *** End of error message ***
--------------------------------------------------------------------------
prterun noticed that process rank 1 with PID 3092124 on node EPYC exited on
signal 11 (Segmentation fault).
--------------------------------------------------------------------------

It works for one processor and gives correct results. Can you reproduce the error?
Thanks!
Christoph

Hi, thanks for reporting this. This is a bug in the mesh distribution that occurs when entity names (solids, faces, …) are empty strings. The fix is here on master:

As a workaround for older versions, you can remove the empty names from your geometry:


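# sys, OCCGeometry and comm are assumed to come from the surrounding script.
if len(sys.argv) > 1 and sys.argv[1] == "step":
    geo = OCCGeometry("Cube.step")
    # Replace empty-string names with None on all sub-shapes.
    for s in geo.shape.solids + geo.shape.faces + geo.shape.edges + geo.shape.vertices:
        if s.name == "":
            s.name = None
    # Rebuild the geometry from the cleaned shape before meshing.
    geo = OCCGeometry(geo.shape)
    ngmesh = geo.GenerateMesh(maxh=0.1, comm=comm)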
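
If the same cleanup is needed for several geometries, the loop could be wrapped in a small helper. The following is only a minimal sketch: the function name clear_empty_names is hypothetical and not part of netgen, and it assumes only that each entity has a writable name attribute, as in the workaround. The SimpleNamespace objects are stand-ins so the snippet runs without netgen installed.

```python
from types import SimpleNamespace


def clear_empty_names(entities):
    """Set name to None on every entity whose name is the empty string.

    Hypothetical helper (not a netgen function). Returns the number of
    entities that were changed.
    """
    changed = 0
    for ent in entities:
        if getattr(ent, "name", None) == "":
            ent.name = None
            changed += 1
    return changed


# Stand-in objects for illustration only. With netgen one would pass, e.g.,
# geo.shape.solids + geo.shape.faces + geo.shape.edges + geo.shape.vertices
demo = [
    SimpleNamespace(name=""),
    SimpleNamespace(name="inlet"),
    SimpleNamespace(name=None),
]
n = clear_empty_names(demo)
print(n, [e.name for e in demo])
```

The returned count also gives a quick check of whether a given step file contains empty names at all.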

best Christopher
