Using MPI without building from source

Hello,

I have been trying to run NGSolve in parallel on an HPC cluster, and I’d like to know whether the Launchpad download of NGSolve can be used with MPI. I have tried compiling it from source both on the cluster and in containers, but I ran into errors in both cases that I have been unable to fix.

Thanks

Hello,

The setup on HPC clusters varies a lot, so we do not offer prebuilt binaries for such environments.
Often, the default compiler on clusters is too old. Which OS and compiler were you using?
For further hints, I need your configuration (cmake) command and the complete command-line output.

Best,
Matthias

The best attempt I’ve had so far is building NGSolve in a Singularity container with an Ubuntu environment. In it, I’ve installed the packages that are listed on the “Build on Linux” page, as well as openmpi-bin, libopenmpi-dev, and numpy/scipy. My cmake command is:

cmake -DUSE_MPI=ON -DUSE_GUI=OFF -DCMAKE_INSTALL_PREFIX=${BASEDIR}/ngsolve-install ${BASEDIR}/ngsolve-src

The error message that I’ve received is quite long and I don’t know what parts are relevant, so I’ll attach a text file of the whole message. However, I think the important line is:

error: inlining failed in call to always_inline '__m256d _mm256_fmadd_pd(__m256d, __m256d, __m256d)': target specific option mismatch

I’ve searched for this error message myself and I’ve seen people suggest adding flags like “-msse4.1”, “-march=native”, “-march=nehalem”, and “-mavx” to CMAKE_CXX_FLAGS. I’ve tried this and have still gotten the same error.
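
Some context on why those flags may not help: the failing intrinsic, _mm256_fmadd_pd, belongs to the FMA extension. GCC enables FMA with -mfma, or with -march=native when the CPU of the build machine supports it; -mavx, -msse4.1 and -march=nehalem do not turn it on. A minimal sketch for checking which SIMD feature macros a given set of compiler flags actually defines is shown here. The helper names enabled_simd_features and avx2_without_fma are hypothetical and not part of NGSolve.

```cpp
#include <string>

// Returns a space-separated list of the SIMD feature macros that the
// compiler defined for this translation unit (depends on -m... flags).
std::string enabled_simd_features() {
    std::string out;
#ifdef __SSE4_1__
    out += "SSE4.1 ";
#endif
#ifdef __AVX__
    out += "AVX ";
#endif
#ifdef __AVX2__
    out += "AVX2 ";
#endif
#ifdef __FMA__
    out += "FMA ";
#endif
    return out;
}

// True when AVX2 code paths are enabled but FMA intrinsics are not.
// In that configuration, code that is guarded only by __AVX2__ and calls
// an FMA intrinsic can fail with "target specific option mismatch".
constexpr bool avx2_without_fma() {
#if defined(__AVX2__) && !defined(__FMA__)
    return true;
#else
    return false;
#endif
}
```

Compiling this once with the flags cmake passes to the compiler and printing the result of enabled_simd_features() would show whether FMA is really enabled for the build.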

Thank you for your willingness to help.

https://ngsolve.org/media/kunena/attachments/1252/Error.txt

Attachment: Error.txt

Edit ngsolve/ngstd/simd.hpp, line 1047:

replace #ifdef __AVX2__ by
#ifdef __FMA__

and do the same again in line 1065.

Joachim
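
The idea behind that change can be sketched as follows: a call to an FMA intrinsic should be guarded by the compiler macro __FMA__ rather than by the AVX2 macro, because a compiler can have AVX2 enabled without FMA. This is only an illustration under that assumption, not the actual NGSolve code; the type Vec4 and the function fused_multiply_add are hypothetical names.

```cpp
#include <array>
#if defined(__AVX__)
#include <immintrin.h>
#endif

using Vec4 = std::array<double, 4>;

// Computes a*b + c element-wise, choosing the code path at compile time.
Vec4 fused_multiply_add(const Vec4& a, const Vec4& b, const Vec4& c) {
    Vec4 out{};
#if defined(__AVX__) && defined(__FMA__)
    // FMA is available: a single fused instruction per 4 doubles.
    __m256d r = _mm256_fmadd_pd(_mm256_loadu_pd(a.data()),
                                _mm256_loadu_pd(b.data()),
                                _mm256_loadu_pd(c.data()));
    _mm256_storeu_pd(out.data(), r);
#elif defined(__AVX__)
    // AVX without FMA: separate multiply and add.
    __m256d prod = _mm256_mul_pd(_mm256_loadu_pd(a.data()),
                                 _mm256_loadu_pd(b.data()));
    __m256d r = _mm256_add_pd(prod, _mm256_loadu_pd(c.data()));
    _mm256_storeu_pd(out.data(), r);
#else
    // No AVX at all: plain scalar fallback.
    for (int i = 0; i < 4; ++i) out[i] = a[i] * b[i] + c[i];
#endif
    return out;
}
```

With this kind of guard, the same source compiles whether or not the chosen flags enable FMA, and only the selected code path differs.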

That did the trick. Thank you very much for your help.

Sorry to bother you again. Everything in the container is built and I’ve moved it to the HPC. I can successfully run the MPI tutorials provided in the source, but when I try to run my program, I get segmentation faults. Specifically, I get:

[node4][[23534,1],3][btl_tcp_frag.c:237:mca_btl_tcp_frag_recv] mca_btl_tcp_frag_recv: readv failed: Connection reset by peer (104)
[node4][[23534,1],4][btl_tcp_frag.c:237:mca_btl_tcp_frag_recv] mca_btl_tcp_frag_recv: readv failed: Connection reset by peer (104)
Caught SIGSEGV: segmentation fault
Collecting backtrace...
#1 /opt/ngsuite/ngsolve-install/lib/python3/dist-packages/netgen/../../../libngcore.so(+0x1864b) [0x7fa70f33664b]
#2 /lib/x86_64-linux-gnu/libc.so.6(+0x43f60) [0x7fa7109bef60]

I have attached a copy of the code that I used.

Thank you for your help.

https://ngsolve.org/media/kunena/attachments/1252/Nanosphere.py

Attachment: Nanosphere_2019-11-11.py