Install NGSolve without root access

Hello all,

I would like to install NGSolve on a local cluster, but I don’t have root access.
Several required libraries are missing, including:

1. tk version > 8.5 (I only have the module tcltk version 8.4.19)
2. libxmu-dev, xorg-dev, libglu
3. liblapacke-dev

I was wondering whether I could install NGSolve without these libraries, or whether there is a way to install them without root access. If there is no cheap way to do so, I will ask the cluster helpdesk to install the missing libraries.

Best regards,
Guosheng

You can build Netgen without the GUI (CMake-option -DUSE_GUI=OFF), in which case you should not need the tcl/tk/OpenGL-libraries.

You can run all computations without the GUI and use VTKOutput together with, for example, ParaView for visualization.

If you need/want the GUI, you can in principle compile all of those libraries yourself. However, properly compiling and installing the OpenGL libraries is probably pretty difficult, and I have never made this work myself, so in that case I would recommend asking the cluster helpdesk.

If you want to use LAPACK, you will either need the liblapacke-dev package or you will have to compile a lapack-library yourself. One option which is relatively uncomplicated would be openblas (which also provides lapack-functionality).

Alternatively, you can turn LAPACK support off with the CMake-option -DUSE_LAPACK=OFF.

One more thing to consider: If you are planning on using MPI, the Netgen GUI does not currently work at all (or, at least, is very unstable).
Also, there should be a big patch with MPI-bugfixes and additional MPI-functionality coming out in the next few days, so maybe wait for that one.

Best regards,
Lukas Kogler

Hi Lukas,

So, I turned off the GUI, and I am able to install netgen with the python interface.
But I got a compile error when installing ngsolve. Here is the error output:

[ 37%] Building CXX object fem/CMakeFiles/ngfem.dir/l2hofe.cpp.o
[ 37%] Building CXX object fem/CMakeFiles/ngfem.dir/l2hofe_trig.cpp.o
[ 38%] Building CXX object linalg/CMakeFiles/ngla.dir/elementbyelement.cpp.o
[ 39%] Building CXX object linalg/CMakeFiles/ngla.dir/arnoldi.cpp.o
[ 40%] Building CXX object fem/CMakeFiles/ngfem.dir/l2hofe_segm.cpp.o
[ 41%] Building CXX object linalg/CMakeFiles/ngla.dir/paralleldofs.cpp.o
[ 41%] Building CXX object linalg/CMakeFiles/ngla.dir/python_linalg.cpp.o
[ 42%] Building CXX object linalg/CMakeFiles/ngla.dir/umfpackinverse.cpp.o
[ 43%] Building CXX object fem/CMakeFiles/ngfem.dir/l2hofe_tet.cpp.o
[ 43%] Building CXX object fem/CMakeFiles/ngfem.dir/hcurlhofe.cpp.o
[ 44%] Building CXX object fem/CMakeFiles/ngfem.dir/hcurlhofe_hex.cpp.o
g++: internal compiler error: Killed (program cc1plus)
Please submit a full bug report,
with preprocessed source if appropriate.
See http://gcc.gnu.org/bugs.html for instructions.
make[6]: *** [fem/CMakeFiles/ngfem.dir/l2hofe_trig.cpp.o] Error 4
make[6]: *** Waiting for unfinished jobs…
[ 45%] Linking CXX shared library libngla.so
[ 45%] Built target ngla
make[5]: *** [fem/CMakeFiles/ngfem.dir/all] Error 2
make[4]: *** [all] Error 2
make[3]: *** [dependencies/Stamp/ngsolve/ngsolve-build] Error 2
make[2]: *** [CMakeFiles/ngsolve.dir/all] Error 2
make[1]: *** [CMakeFiles/ngsolve.dir/rule] Error 2
make: *** [ngsolve] Error 2

Do you know what might have gone wrong?
I use cmake version 3.5.2, gcc/g++ version 5.1.0, and python3.4.

Best,
Guosheng

You might have run out of memory during compilation. Try keeping an eye on the free memory on the node you are compiling on (e.g. watch -n 0.5 free -g).

This can happen if you compile with very many threads on a node with comparatively little memory. Try to compile with fewer threads (make -j 4).
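
To make the "fewer threads" advice concrete, here is a minimal Python sketch that picks a -j value from the free memory. The function names and the 2 GB-per-compile-job figure are assumptions for illustration, not something measured for NGSolve; heavily templated files may need more.

```python
def safe_job_count(mem_available_gb, cores, gb_per_job=2.0):
    """Return a value for `make -j` that should fit into memory.

    gb_per_job is an assumed upper bound on the memory one compiler
    process needs; adjust it after watching `free -g` during a build.
    """
    by_memory = int(mem_available_gb // gb_per_job)
    # Never exceed the core count, and always allow at least one job.
    return max(1, min(cores, by_memory))


def available_memory_gb(meminfo_path="/proc/meminfo"):
    """Read MemAvailable (reported in kB) from a Linux meminfo file."""
    with open(meminfo_path) as fh:
        for line in fh:
            if line.startswith("MemAvailable:"):
                return int(line.split()[1]) / 1024**2
    raise RuntimeError("MemAvailable not found in " + meminfo_path)
```

With 8 GB free and 16 cores, safe_job_count(8, 16) suggests make -j 4, so memory rather than the core count is the limit.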

Yeah, it was a memory issue.

Now, I got a build error at the very final stage:

[100%] Linking CXX executable ngs
[100%] Built target ngslib
../comp/libngcomp.so: undefined reference to `_ZNK6netgen8Ngx_Mesh26MultiElementTransformationILi2ELi2EDv4_dEEviiPKT1_mPS3_mS6_m'
../comp/libngcomp.so: undefined reference to `_ZNK6netgen8Ngx_Mesh26MultiElementTransformationILi1ELi3EDv4_dEEviiPKT1_mPS3_mS6_m'
../comp/libngcomp.so: undefined reference to `_ZNK6netgen8Ngx_Mesh26MultiElementTransformationILi2ELi3EDv4_dEEviiPKT1_mPS3_mS6_m'
../comp/libngcomp.so: undefined reference to `_ZNK6netgen8Ngx_Mesh26MultiElementTransformationILi1ELi1EDv4_dEEviiPKT1_mPS3_mS6_m'
../comp/libngcomp.so: undefined reference to `_ZNK6netgen8Ngx_Mesh26MultiElementTransformationILi1ELi2EDv4_dEEviiPKT1_mPS3_mS6_m'
../comp/libngcomp.so: undefined reference to `_ZNK6netgen8Ngx_Mesh26MultiElementTransformationILi3ELi3EDv4_dEEviiPKT1_mPS3_mS6_m'
../comp/libngcomp.so: undefined reference to `_ZNK6netgen8Ngx_Mesh26MultiElementTransformationILi0ELi2EDv4_dEEviiPKT1_mPS3_mS6_m'
../comp/libngcomp.so: undefined reference to `_ZNK6netgen8Ngx_Mesh26MultiElementTransformationILi0ELi1EDv4_dEEviiPKT1_mPS3_mS6_m'
collect2: error: ld returned 1 exit status
make[5]: *** [solve/ngs] Error 1
make[4]: *** [solve/CMakeFiles/ngs.dir/all] Error 2
make[3]: *** [all] Error 2
make[2]: *** [dependencies/Stamp/ngsolve/ngsolve-build] Error 2
make[1]: *** [CMakeFiles/ngsolve.dir/all] Error 2
make: *** [all] Error 2

Any clue what might have gone wrong?

Best,
Guosheng

It says that it is missing some symbols which should be in libinterface.so, which is a netgen-library.

The build-process works like this:
First, netgen is compiled, linked and installed and after that ngsolve is compiled, then the netgen-libs are linked to the ngsolve-libs where necessary and then ngsolve is installed.

The issue seems to be that the linker either cannot find libinterface or that the one it is trying to link to does not have the right symbols in it.

Where are you installing netgen/ngsolve TO? Do you have write-permissions there?
If not, cmake cannot properly copy the netgen-libraries out of the build-folder into the install-folder and then cannot find them during the linking of ngsolve. I am not sure which install-path is the default, but I think it is “/bin/…” or “/opt/…” where you do not have write-permissions. You can set the folder you are installing to with “-DCMAKE_INSTALL_PREFIX=…” .

You can check what exactly is happening with “make VERBOSE=1”, you should be able to tell exactly against which libraries the compiler is trying to link from that.

Check where the compiler thinks libinterface.so is located and doublecheck that it is actually there.

You can also check if the libinterface.so you are linking to actually has those symbols with:

nm ~/local/netgen-std/lib/libinterface.so -a | grep MultiElement

This should give you a couple of lines like these:
000000000016d10 T _ZNK6netgen8Ngx_Mesh26MultiElementTransformationILi1ELi1EdEEviiPKT1_mPS2_mS5_m
0000000000016ef0 T _ZNK6netgen8Ngx_Mesh26MultiElementTransformationILi1ELi1EDv4_dEEviiPKT1_mPS3_mS6_m
0000000000016ce0 T _ZNK6netgen8Ngx_Mesh26MultiElementTransformationILi1ELi2EdEEviiPKT1_mPS2_mS5_m

If you do not get any output here, this means that the library does not have these symbols. In that case, are you maybe trying to link against a libinterface from an old install or maybe even against a library that is called “libinterface” but might just by coincidence have the same name and actually come from somewhere completely different?
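
The nm check above can also be scripted. The following is a small Python sketch (the function name is made up for illustration) that takes the text printed by nm and reports whether one mangled symbol is defined in the library, only referenced by it, or absent. In nm output a type letter such as T means the symbol is defined in that file, while U means it is used there but has to come from some other library.

```python
def symbol_status(nm_output, mangled_name):
    """Classify one mangled symbol as 'defined', 'undefined' or 'missing'.

    nm prints "<address> <type> <name>" for defined symbols and
    "<type> <name>" (no address) for undefined ones, whose type is 'U'.
    """
    status = "missing"
    for line in nm_output.splitlines():
        fields = line.split()
        if len(fields) < 2 or fields[-1] != mangled_name:
            continue
        if fields[-2] == "U":
            # Referenced here, but another library has to provide it.
            status = "undefined"
        else:
            return "defined"
    return status
```

If the symbol from the linker error comes back as 'missing' or 'undefined' for the libinterface.so you are linking against, that library is not the one NGSolve expects.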

If you still have issues, could you post the output of “make VERBOSE=1” (the command that comes after the cmake-message “Linking CXX shared library libngcomp.so”)?

This happens if you compile Netgen without AVX support but NGSolve with AVX support. Did you build Netgen separately (i.e. no superbuild) or play with the setting USE_NATIVE_ARCH?

Please attach the following files in your build directory (they contain all your build settings):
./CMakeCache.txt
./ngsolve/CMakeCache.txt
./netgen/CMakeCache.txt
./netgen/netgen/CMakeCache.txt

Best,
Matthias

Yeah, it was the linking issue.

Initially, I successfully installed ngsolve without MPI on a local folder ~/netgen/inst/

Then, I tried to turn on MPI, and install it on another folder ~/netgen/inst-mpi/, which caused the issue.
The issue is that at the final stage, the compiler tries to link against the old libraries in the folder ~/netgen/inst/ rather than those in the folder ~/netgen/inst-mpi/

So, I changed my install directory for the MPI version back to ~/netgen/inst/ and the installation works fine now. Is there a way to tell the compiler explicitly which library to link against? I don’t understand why it searches the old folder…

Now, I tried to run the tutorial.
After adding “from mpi4py import *” in the python tutorial file, I can run the code with command

python3 adaptive.py
But, it gives me a segmentation fault when I run, say,
mpirun -n 4 python3 adaptive.py

If you want to use mpi4py, you have to make sure that mpi4py and ngsolve both use the exact same mpi-library, we have had issues with that in the past.

Also, I think you have to import mpi4py BEFORE netgen/ngsolve, because on import ngsolve checks whether MPI has already been initialized and initializes it if not, and I do not know whether mpi4py likes it when someone else has already done that.

For ngsolve to work you do not necessarily need mpi4py (anymore).

Today, there was an update for the netgen- and ngsolve master branches which featured a bunch of MPI-related bugfixes. Those are probably ESSENTIAL!
There are now also a couple of mpi-tutorial files in “ngsolve/py_tutorials/mpi/”.

You need to get the newest version of BOTH ngsolve and netgen.
Keep in mind that when you “git pull” in the ngsolve-directory, it will probably not update netgen yet, so go to “ngsolve/external_dependencies/netgen” and “git pull” there too!

Also, if you run into runtime-library-issues, use “ngspy” instead of “python3”.
ngspy is just a wrapper around python3 which preloads a couple of libraries.

And about the linking issue:
The linker usually takes the first library of any name it can find, and if ~/netgen/inst/lib is in your LD_LIBRARY_PATH, it sometimes takes the wrong one.

Do you have environment-modules on your cluster?

In that case, you could create one module “netgen” and one module “netgen-mpi” and only load one at any given time in order to properly separate them.
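
To see which copy of a library wins under the "first one found" rule described above, a small Python sketch like the following can help. The helper name is hypothetical, and it only mimics a front-to-back search of a colon-separated path; the real linker and loader consult additional locations as well.

```python
import os


def first_match(search_path, lib_name):
    """Return the first directory in a colon-separated path holding lib_name.

    Directories are tried front to back, so an old install that appears
    earlier in the path shadows a newer one that appears later.
    """
    for directory in search_path.split(":"):
        if directory and os.path.isfile(os.path.join(directory, lib_name)):
            return directory
    return None
```

For example, first_match(os.environ.get("LD_LIBRARY_PATH", ""), "libinterface.so") shows whether ~/netgen/inst/lib or ~/netgen/inst-mpi/lib comes first in the current shell.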

OK. So I updated the library, and finally have the MPI version installed.
Previously, the problem was the MPI location. There are two MPI installations on the cluster, and the one located under the python folder does not work properly. So, how do I specify the location of MPI in cmake? Something like -DMPI_ROOT=…?

Now, I need to add a direct solver. I don’t have any of umfpack/pardiso/mumps.

  1. I tried to install umfpack in the same way as in the local installation on my laptop, using

“-DCMAKE_PREFIX_PATH= ~/netgen/SuiteSparse -DUSE_UMFPACK=ON”

but got a lengthy error at the final linking stage:
../fem/libngfem.so: undefined reference to `std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::replace(unsigned long, unsigned long, char const*, unsigned long)@GLIBCXX_3.4.21'
../comp/libngcomp.so: undefined reference to `std::basic_ostream<char, std::char_traits<char> >& std::operator<< <char, std::char_traits<char>, std::allocator<char> >(std::basic_ostream<char, std::char_traits<char> >&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)@GLIBCXX_3.4.21'
../comp/libngcomp.so: undefined reference to `std::basic_ofstream<char, std::char_traits<char> >::basic_ofstream(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::_Ios_Openmode)@GLIBCXX_3.4.21'
libsolve.so: undefined reference to `VTT for std::__cxx11::basic_stringstream<char, std::char_traits<char>, std::allocator<char> >@GLIBCXX_3.4.21'

  2. A similar thing happens when I activate mumps with “-DUSE_MUMPS=ON”.

  3. I tried to add pardiso with MKL:

“-DUSE_MKL=ON
-DMKL_ROOT=/soft/intel/x86_64/12.1/8.273/composer_xe_2011_sp1.8.273/mkl/”

but got a compile error for ngsolve at the very beginning:
netgen/src/ngsolve/ngstd/taskmanager.cpp: In member function ‘void ngstd::TaskManager::Loop(int)’:
netgen/src/ngsolve/ngstd/taskmanager.cpp:345:32: error: ‘mkl_set_num_threads_local’ was not declared in this scope
mkl_set_num_threads_local(1);

Finally, I have a convergence issue with the provided preconditioner. Running with

mpirun -np 5 ngspy mpi_poission.py
(the bddc preconditioner)
I got the following diverging result:
assemble VOL element 6697/6697
assemble VOL element 6697/6697
create masterinverse
master: got data from 4
now build graph
n = 8507
now build matrix
have matrix, now invert
start order
order … 14952360 Bytes task-based parallelization (C++11 threads) using 1 threads
factor SPD …
0 1.00669
1 0.940628
2 0.533298
3 0.540046
4 1.33798
5 1.05662

But without MPI, the method converges in 12 iterations.
I replaced the preconditioner with type “local”, then there is no convergence difference between the mpi version and non-mpi version. Is this to be expected?

Thanks in advance,
Guosheng

You really seem to encounter all the problems one could think of. First, let me point to a script I wrote for someone else to get NGSolve running on a cluster. I should have mentioned it before, maybe it helps:
https://data.asc.tuwien.ac.at/snippets/6

Now, step by step:

Regarding the UMFPACK linking error (undefined references to std::__cxx11::… @GLIBCXX_3.4.21): things like this happen when C++ libraries that were compiled with different compilers are linked together (e.g. Umfpack compiled with gcc 6 and NGSolve with gcc 4.8). I just saw that I am not passing CMAKE_CXX_COMPILER to the Umfpack subproject; I will fix that.

Anyway, Umfpack doesn’t make sense in a parallel environment, so try to get MUMPS running instead.

Regarding MUMPS: please give me the exact error message. Did you point to a prebuilt version of MUMPS? Otherwise it is built automatically with NGSolve (the recommended approach).

Regarding MKL: the function mkl_set_num_threads_local seems to be missing in your MKL version. Is there no newer version installed? (Yours is from 2011.) If not, you can just comment out those two function calls; they only affect shared-memory parallelization and are irrelevant with MPI.

Regarding the diverging bddc preconditioner with MPI (mpirun -np 5 ngspy mpi_poission.py): this seems to be a bug/missing feature. bddc calls ‘masterinverse’ at one point, which means that the whole matrix is copied to the master rank and inverted there. This seems to work only for symmetrically stored matrices. Lukas is working on it.

I hope we can sort out all the issues before you lose your patience… :)

Best,
Matthias

About taking the right MPI-version:

Basically, cmake is looking for a library called libmpi (and a few others, I think). Where it looks is determined, among other things, by your LD_LIBRARY_PATH, which I think is searched front to back, so make sure that the MPI library you want to use shows up first. You can also use the cmake variable CMAKE_PREFIX_PATH to give cmake additional hints about where to look for libraries; however, I am not sure in which order LD_LIBRARY_PATH and the prefix paths are searched. If cmake still picks the wrong one, its FindMPI module should also accept the path to a specific compiler wrapper, e.g. -DMPI_CXX_COMPILER=/path/to/mpicxx.

In any case, cmake should print which exact MPI library it will be using, and at a later date you can look up which library was used in “build_folder/CMakeCache.txt”, where there should be a couple of lines like these:

//MPI CXX libraries to link against
MPI_CXX_LIBRARIES:STRING=/home/lukas/local/openmpi-2.1-gcc-6.2/lib/libmpi.so

//MPI CXX linking flags
MPI_CXX_LINK_FLAGS:STRING=-Wl,-rpath -Wl,/home/lukas/local/openmpi-2.1-gcc-6.2/lib -Wl,--enable-new-dtags
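
Looking these entries up by hand gets tedious when comparing several build folders, so here is a minimal Python sketch that parses the NAME:TYPE=VALUE lines of a CMakeCache.txt. The function name is an assumption for illustration; only the line format shown above is relied upon.

```python
def read_cmake_cache(text):
    """Parse CMakeCache.txt content into a dict {name: (type, value)}.

    Comment lines start with '#' or '//'; entries look like
    NAME:TYPE=VALUE, and VALUE may itself contain '=' characters.
    """
    entries = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(("#", "//")):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            continue
        name, _, var_type = key.partition(":")
        entries[name] = (var_type, value)
    return entries
```

Calling read_cmake_cache(open("CMakeCache.txt").read())["MPI_CXX_LIBRARIES"] in the netgen and ngsolve build folders makes it easy to confirm that both were configured against the same MPI library.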

If available, environment modules can be used to properly separate different versions of MPI.

The last problem is definitely a bug/missing feature, I am working on it. Thank you for bringing this to our attention.

Temporary workaround:

Change
a = BilinearForm (V, symmetric=False)
to
a = BilinearForm (V, symmetric=True)

However, keep in mind that you have to put it back to symmetric=False if you want to use HYPRE.

You can also tell bddc to use MUMPS inverse (once you get that running), this should generally work.

If you are interested in what exactly the problem is, the story goes like this:

Symmetric storage means that only the lower triangular part (plus the diagonal) of the matrix is stored, which saves a bit of memory.

Masterinverse collects the entire matrix to be inverted on the master proc, which then inverts it - in combination with bddc, this should be ok for smaller problems, because bddc already decreases the size of the problem that has to be solved considerably.

This collecting of the matrix is currently only properly implemented for symmetrically stored matrices, and unfortunately it does not “terminate gracefully” if called with a fully stored one. This should only require a small fix.

I created an issue on gitlab for this:
https://gitlab.asc.tuwien.ac.at/jschoeberl/ngsolve/issues/42

Hi Matthias,

Thanks for the detailed response.
Yeah, umfpack was compiled with an older compiler version than ngsolve, and a similar thing happens for mumps.
I will use your setup to see what happens then.

Even without the linear solver, I am pretty happy with the current MPI version.

Best,
Guosheng

Now mumps is built with the same compiler as ngsolve, but I still get the following error at the final linking stage:

CMakeFiles/ngs.dir/ngs.cpp.o: In function `_ZNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEE12_M_constructIPKcEEvT_S8_St20forward_iterator_tag.isra.32':
/panfs/roc/msisoft/gcc/5.1.0/include/c++/5.1.0/bits/basic_string.tcc:223: undefined reference to `std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_create(unsigned long&, unsigned long)'
libsolve.so: undefined reference to `std::__cxx11::basic_stringstream<char, std::char_traits<char>, std::allocator<char> >::basic_stringstream(std::_Ios_Openmode)'
../comp/libngcomp.so: undefined reference to `VTT for std::__cxx11::basic_istringstream<char, std::char_traits<char>, std::allocator<char> >'
libsolve.so: undefined reference to `std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::push_back(char)'
../comp/libngcomp.so: undefined reference to `std::basic_istream<char, std::char_traits<char> >& std::operator>><char, std::char_traits<char>, std::allocator<char> >(std::basic_istream<char, std::char_traits<char> >&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&)'
libsolve.so: undefined reference to `std::runtime_error::runtime_error(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)'
../comp/libngcomp.so: undefined reference to `std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::resize(unsigned long, char)'
../linalg/libngla.so: undefined reference to `blacs_gridinfo_'
libsolve.so: undefined reference to `std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_erase(unsigned long, unsigned long)'
../comp/libngcomp.so: undefined reference to `std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::rfind(char, unsigned long) const'
libsolve.so: undefined reference to `std::__cxx11::basic_stringstream<char, std::char_traits<char>, std::allocator<char> >::~basic_stringstream()'
libsolve.so: undefined reference to `VTT for std::__cxx11::basic_ostringstream<char, std::char_traits<char>, std::allocator<char> >'
libsolve.so: undefined reference to `std::__cxx11::basic_stringbuf<char, std::char_traits<char>, std::allocator<char> >::str() const'
libsolve.so: undefined reference to `std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::find(char const*, unsigned long, unsigned long) const'
libsolve.so: undefined reference to `VTT for std::__cxx11::basic_stringstream<char, std::char_traits<char>, std::allocator<char> >'
libsolve.so: undefined reference to `operator delete[](void*, unsigned long)'
../comp/libngcomp.so: undefined reference to `std::__cxx11::collate<char> const& std::use_facet<std::__cxx11::collate<char> >(std::locale const&)'
libsolve.so: undefined reference to `std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::append(char const*)'
libsolve.so: undefined reference to `std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::assign(char const*)'
libsolve.so: undefined reference to `std::runtime_error::runtime_error(char const*)'
libsolve.so: undefined reference to `std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::rfind(char const*, unsigned long, unsigned long) const'
libsolve.so: undefined reference to `std::runtime_error::runtime_error(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)'
libsolve.so: undefined reference to `vtable for std::__cxx11::basic_stringbuf<char, std::char_traits<char>, std::allocator<char> >'
../linalg/libngla.so: undefined reference to `pzgetrs_'
libsolve.so: undefined reference to `vtable for std::__cxx11::basic_ostringstream<char, std::char_traits<char>, std::allocator<char> >'
libsolve.so: undefined reference to `operator delete(void*, unsigned long)'
libsolve.so: undefined reference to `std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_replace(unsigned long, unsigned long, char const*, unsigned long)'
../linalg/libngla.so: undefined reference to `blacs_gridinit_'
../comp/libngcomp.so: undefined reference to `std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::~basic_string()'
/home/cockburn/fug/netgen/inst/lib/libinterface.so: undefined reference to `std::__cxx11::basic_stringstream<char, std::char_traits<char>, std::allocator<char> >::basic_stringstream(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::_Ios_Openmode)'
../fem/libngfem.so: undefined reference to `std::runtime_error::runtime_error(std::runtime_error const&)'
../linalg/libngla.so: undefined reference to `pzpotrf_'
../linalg/libngla.so: undefined reference to `descinit_'
libsolve.so: undefined reference to `std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_mutate(unsigned long, unsigned long, char const*, unsigned long)'
libsolve.so: undefined reference to `std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::swap(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&)'
../comp/libngcomp.so: undefined reference to `std::basic_ostream<char, std::char_traits<char> >& std::operator<< <char, std::char_traits<char>, std::allocator<char> >(std::basic_ostream<char, std::char_traits<char> >&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)'
../comp/libngcomp.so: undefined reference to `vtable for std::__cxx11::basic_istringstream<char, std::char_traits<char>, std::allocator<char> >'
libsolve.so: undefined reference to `std::__cxx11::basic_ostringstream<char, std::char_traits<char>, std::allocator<char> >::~basic_ostringstream()'
../linalg/libngla.so: undefined reference to `pzgetrf_'
../fem/libngfem.so: undefined reference to `std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::replace(unsigned long, unsigned long, char const*, unsigned long)'
../linalg/libngla.so: undefined reference to `pdpotrf_'
libsolve.so: undefined reference to `vtable for std::__cxx11::basic_stringstream<char, std::char_traits<char>, std::allocator<char> >'
../linalg/libngla.so: undefined reference to `pdgetrs_'
../ngstd/libngstd.so: undefined reference to `std::basic_istream<char, std::char_traits<char> >& std::getline<char, std::char_traits<char>, std::allocator<char> >(std::basic_istream<char, std::char_traits<char> >&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&, char)'
libsolve.so: undefined reference to `std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_append(char const*, unsigned long)'
libsolve.so: undefined reference to `std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::compare(char const*) const'
libsolve.so: undefined reference to `std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::reserve(unsigned long)'
../comp/libngcomp.so: undefined reference to `std::__cxx11::basic_istringstream<char, std::char_traits<char>, std::allocator<char> >::~basic_istringstream()'
../comp/libngcomp.so: undefined reference to `std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::compare(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) const'
../linalg/libngla.so: undefined reference to `blacs_gridexit_'
../linalg/libngla.so: undefined reference to `pdpotrs_'
../comp/libngcomp.so: undefined reference to `std::__cxx11::basic_stringbuf<char, std::char_traits<char>, std::allocator<char> >::_M_sync(char*, unsigned long, unsigned long)'
../linalg/libngla.so: undefined reference to `pzpotrs_'
../comp/libngcomp.so: undefined reference to `std::basic_ofstream<char, std::char_traits<char> >::basic_ofstream(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::_Ios_Openmode)'
libsolve.so: undefined reference to `std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::find(char, unsigned long) const'
libsolve.so: undefined reference to `std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_assign(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)'
../comp/libngcomp.so: undefined reference to `std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::erase(unsigned long, unsigned long)'
../ngstd/libngstd.so: undefined reference to `std::thread::_M_start_thread(std::shared_ptr<std::thread::_Impl_base>, void (*)())'
libsolve.so: undefined reference to `std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::append(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, unsigned long, unsigned long)'
../comp/libngcomp.so: undefined reference to `std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_construct(unsigned long, char)'
libsolve.so: undefined reference to `std::__cxx11::basic_ostringstream<char, std::char_traits<char>, std::allocator<char> >::basic_ostringstream(std::_Ios_Openmode)'
../linalg/libngla.so: undefined reference to `numroc_'
libsolve.so: undefined reference to `std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_replace_aux(unsigned long, unsigned long, unsigned long, char)'
libsolve.so: undefined reference to `std::runtime_error::runtime_error(char const*)'
../linalg/libngla.so: undefined reference to `pdgetrf_'
collect2: error: ld returned 1 exit status
make[5]: *** [solve/ngs] Error 1
make[4]: *** [solve/CMakeFiles/ngs.dir/all] Error 2
make[3]: *** [all] Error 2
make[2]: *** [dependencies/Stamp/ngsolve/ngsolve-build] Error 2
make[1]: *** [CMakeFiles/ngsolve.dir/all] Error 2
make: *** [all] Error 2

Just to add an additional comment.

Now, I finally get a consistent result for the installation error. My previously working installation without mumps fails with the same message :<

It fails to generate the dynamic library libsolve.so at the final stage, and the following error message appears no matter whether MPI is turned on/off or mumps is turned on/off:
CMakeFiles/ngs.dir/ngs.cpp.o: In function `main':
ngs.cpp:(.text.startup+0xda): undefined reference to `std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_create(unsigned long&, unsigned long)'
libsolve.so: undefined reference to `std::__cxx11::basic_stringstream<char, std::char_traits<char>, std::allocator<char> >::basic_stringstream(std::_Ios_Openmode)'

I tried different python3 libraries and different gcc compilers, but with no luck.

Continuing with a positive update.

I realized that the issue is indeed an incompatible gcc compiler version, as Matthias pointed out.
I used the MPI module mpich, which seems to be compatible only with gcc 4.9.2, because when I call module avail mpich, it gives me
mpich/3.1.4/gnu-4.9.2

After changing my gcc compiler to version 4.9.2, I am able to install and run ngsolve with MPI, still without a direct solver installed.
When I change the compiler to version 5.1.0 or 6.1.0, the same std::__cxx11 error appears.

Now, coming back to the mumps issue. After turning on mumps, I have parmetis and mumps built with the same gcc 4.9.2 compiler, but I get the following error when building the libsolve.so library at the final step:

../linalg/libngla.so: undefined reference to `blacs_gridinfo_'
../linalg/libngla.so: undefined reference to `pzgetrs_'
../linalg/libngla.so: undefined reference to `blacs_gridinit_'
../linalg/libngla.so: undefined reference to `pzpotrf_'
../linalg/libngla.so: undefined reference to `descinit_'
../linalg/libngla.so: undefined reference to `pzgetrf_'
../linalg/libngla.so: undefined reference to `pdpotrf_'
../linalg/libngla.so: undefined reference to `pdgetrs_'
../linalg/libngla.so: undefined reference to `blacs_gridexit_'
../linalg/libngla.so: undefined reference to `pdpotrs_'
../linalg/libngla.so: undefined reference to `pzpotrs_'
../linalg/libngla.so: undefined reference to `numroc_'
../linalg/libngla.so: undefined reference to `pdgetrf_'

I typed
nm ngsolve/linalg/libngla.so -a | grep blacs
and get
U blacs_gridexit_
U blacs_gridinfo_
U blacs_gridinit_
The same holds for the other reference names; they all start with a capital U.
What does this mean?

We are getting closer to a running version. The problem now is that libscalapack is not linked to libngla. In case you are linking with MKL this should be handled automatically. So I assume you disabled USE_MKL, right?

Without MKL you have to set scalapack manually, e.g. by configuring with
-DSCALAPACK_LIBRARY=/usr/lib/libscalapack.so
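
Two different kinds of undefined references have shown up in this thread, and they need different fixes. As a rough Python sketch (the function names and the matching rules are heuristics assumed for illustration, not an exhaustive classifier), one can sort the symbols from a linker log into "compiler/ABI mismatch" and "ScaLAPACK not linked":

```python
import re

# Helper routines seen in the log above that do not follow the p?xxxx_ pattern.
SCALAPACK_PREFIXES = ("blacs_", "descinit_", "numroc_")


def undefined_symbols(linker_output):
    """Extract the symbol names from "undefined reference to `...'" lines."""
    return re.findall(r"undefined reference to `([^']+)'", linker_output)


def classify_undefined(symbol):
    """Guess the likely cause of one undefined symbol (heuristic only)."""
    if "__cxx11" in symbol or "GLIBCXX_" in symbol:
        # std::__cxx11 names only exist in the libstdc++ of gcc >= 5.
        return "libstdc++ ABI mismatch (libraries built with different gcc versions)"
    if symbol.startswith(SCALAPACK_PREFIXES) or re.fullmatch(r"p[sdcz][a-z0-9]+_", symbol):
        # Fortran-style names such as pdgetrf_ or pzpotrs_.
        return "ScaLAPACK/BLACS not linked"
    return "unknown"
```

Running classify_undefined over undefined_symbols(log) for a saved linker log gives a quick overview of whether the remaining problem is the compiler version or the missing -DSCALAPACK_LIBRARY setting.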

Regards,
Matthias

Yeah, MKL was turned off.
According to the info from the cluster, scalapack is available through MKL.
I have many versions of MKL, but none of them seems to work properly for me.

The first issue is that cmake cannot find the blacs library (all the others are found according to ccmake):
MKL_BLACS_LIBRARY-NOTFOUND

But there are plenty of blacs libraries in the mkl/lib folder:
libmkl_blacs_ilp64.a libmkl_blacs_intelmpi_ilp64.so libmkl_blacs_intelmpi_lp64.so libmkl_blacs_openmpi_ilp64.a libmkl_blacs_sgimpt_ilp64.a libmkl_blas95_ilp64.a
libmkl_blacs_intelmpi_ilp64.a libmkl_blacs_intelmpi_lp64.a libmkl_blacs_lp64.a libmkl_blacs_openmpi_lp64.a libmkl_blacs_sgimpt_lp64.a libmkl_blas95_lp64.a

I manually pointed to a blacs library, say libmkl_blacs_lp64.a; the code then compiles and installs, but the resulting installation is broken.

I run a test, say
ngspy poisson.py
It gives me two error messages
In the beginning:
ERROR: ld.so: object ‘MKL_BLACS_LIBRARY-NOTFOUND’ from LD_PRELOAD cannot be preloaded: ignored.

and another error message in the end:
python3: symbol lookup error: /soft/intel/x86_64/2015/composer_xe_2015_msi/composer_xe_2015.3.187/mkl/lib/intel64/libmkl_gnu_thread.so: undefined symbol: omp_get_num_procs

You can ignore the first error message from LD_PRELOAD, since you linked MKL_BLACS statically.
A quick guess for the second one:
Try to set MKL_THREADING_LAYER=GNU

Cheers,
Matthias