Conda installer: using MPI

Thauwa · December 31, 2019, 10:57am

Firstly, many thanks to the developers for their hard work! I tried for several weeks to make NGSolve run on a cluster on which I had limited permissions, and the Conda installer finally solved the problem for me.

My current issue is that all my processes using mpirun have rank 0 when I try using NGSolve. Online, others say that this is due to clashing MPI libraries. However, I checked and this does not seem to be the case.

So, would someone mind clarifying:

1. Does the Conda installer have MPI enabled?
2. If it does not, how would I go about enabling MPI? Should I manually edit some CMake file in the installer and reinstall it in Conda?
3. What compiler/version should I use to make mpirun work best? I tried ICC 2017, 2019, GCC 2017 and OMPI.

The cluster uses CentOS7.

Thanks in advance for your time - and Happy 2020!
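
A minimal sketch for narrowing down the rank-0 symptom described above, assuming mpi4py is installed in the same environment (an assumption, the post does not say so). The helper names `report_rank` and `looks_serial` are hypothetical. Each process prints its rank, the world size and the MPI library it was linked against; the second function checks the collected output for the "every process is rank 0 of 1" pattern.

```python
def report_rank():
    """Call this in a script started with mpirun; prints rank, size and MPI library."""
    from mpi4py import MPI  # assumption: mpi4py is available in this environment

    comm = MPI.COMM_WORLD
    # First line of the library banner, e.g. an Open MPI or Intel MPI version string.
    banner = (MPI.Get_library_version().splitlines() or ["unknown"])[0]
    print(f"rank={comm.Get_rank()} size={comm.Get_size()} lib={banner}")


def looks_serial(lines):
    """Return True if several processes each reported rank 0 in a world of size 1."""
    pairs = []
    for line in lines:
        # Only the first two fields are parsed; the library banner may contain spaces.
        fields = dict(item.split("=", 1) for item in line.split()[:2])
        pairs.append((int(fields["rank"]), int(fields["size"])))
    return len(pairs) > 1 and all(pair == (0, 1) for pair in pairs)
```

For example, a script that only calls `report_rank()` could be started with `mpirun -n 4 python check_rank.py` (the file name is a placeholder). If the output shows distinct ranks here while NGSolve still reports rank 0 everywhere, that would suggest the launcher and mpi4py agree and the NGSolve build itself is not using MPI.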

lkogler · January 2, 2020, 10:58am

As you have correctly recognized, when all processes have rank 0, you are usually dealing with a non-MPI version of NGSolve.

As far as I know, MPI is disabled for the anaconda installer.
I have no idea how the anaconda stuff works, sorry. Generally, for an MPI version, you have to compile NGSolve yourself.
I think there have been issues with the Intel compiler in the past, so I would suggest sticking to gcc or clang. gcc versions 8.2 as well as 9.1/9.2 have bugs that break NGSolve. Use the same compilers for NGSolve and MPI. (Also, if you are using mpi4py, take care that it uses the same MPI installation as NGSolve.)

Best,
Lukas
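
The compiler advice above can be turned into a quick pre-build check. This is only a sketch: the set of flagged release series is copied from the post (8.2, 9.1, 9.2) and is not an exhaustive or authoritative list, and the function names are hypothetical.

```python
import re
import subprocess

# GCC release series named in this thread as breaking NGSolve (not exhaustive).
FLAGGED_SERIES = {(8, 2), (9, 1), (9, 2)}


def parse_version(text):
    """Extract (major, minor, patch) from a string such as '9.2.0' or '9.3'."""
    match = re.search(r"(\d+)\.(\d+)(?:\.(\d+))?", text)
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    return tuple(int(part) for part in match.groups(default="0"))


def is_flagged(text):
    """True if the version belongs to one of the series flagged in the thread."""
    major, minor, _patch = parse_version(text)
    return (major, minor) in FLAGGED_SERIES


def local_gcc_version(compiler="gcc"):
    """Ask the compiler on PATH for its full version string."""
    result = subprocess.run(
        [compiler, "-dumpfullversion", "-dumpversion"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()
```

Calling `is_flagged(local_gcc_version())` before configuring a build gives an early warning; the same check should be applied to the compiler used to build the MPI library, since the post recommends using one compiler for both.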

Thauwa · January 3, 2020, 1:10am

Thank you for clarifying! Neither I nor the supercomputer’s administrators have been able to compile NGSolve for the system, despite several months of trying, due to very tough security restrictions. The Conda alternative helped alleviate this problem, but I guess I will wait for further developments. Happy 2020!

arashgmn · June 9, 2022, 9:18am

Just curious, does this bug still exist in GCC versions newer than 9.2? Is there a chance of getting NGSolve to run?

christopher · June 9, 2022, 1:15pm

The mentioned bug was fixed in 9.3 afaik