Performance Issue using float vs. np.float64

Hello everybody,

I just came across some unexpected performance behavior of NGSolve that might be interesting. When building a matrix from two existing matrices, it makes a huge difference whether one multiplies by a Python float or by a numpy.float64.

I created a minimal working example (MWE) so you can test it on your machine.

MWE_AddMat.py (1.2 KB)

On mine, the code

Mstar.AsVector().data = np_float * blf.mat.AsVector() + blf.mat.AsVector()

takes approximately 300 times longer than

Mstar.AsVector().data = python_float * blf.mat.AsVector() + blf.mat.AsVector()

where python_float and np_float hold the same value and differ only in data type. Furthermore, adding the second matrix (which carries no scalar factor of its own) also takes significantly longer when the first matrix is multiplied by the numpy float.
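
For orientation, here is a rough sketch of the kind of comparison (a simplified stand-in, not the attached file; the mesh and bilinear form are just placeholders):

from ngsolve import *
from netgen.geom2d import unit_square
import numpy as np
from time import time

# placeholder problem, just to get an assembled sparse matrix
mesh = Mesh(unit_square.GenerateMesh(maxh=0.05))
fes = H1(mesh, order=3)
u, v = fes.TnT()
blf = BilinearForm(fes)
blf += grad(u) * grad(v) * dx
blf.Assemble()

Mstar = blf.mat.CreateMatrix()   # same sparsity pattern as blf.mat

python_float = 0.5
np_float = np.float64(0.5)

t = time()
Mstar.AsVector().data = python_float * blf.mat.AsVector() + blf.mat.AsVector()
print("python float:", time() - t)

t = time()
Mstar.AsVector().data = np_float * blf.mat.AsVector() + blf.mat.AsVector()
print("numpy float: ", time() - t)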

To overcome the issue, one can simply convert the numpy float to a Python float using float(np_float).
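
Applied to the line above, that is:

# cast the numpy scalar back to a plain Python float before multiplying
Mstar.AsVector().data = float(np_float) * blf.mat.AsVector() + blf.mat.AsVector()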

I don’t know if this issue is solvable on the C++ side of NGSolve.

Greetings,
Paul

Hi Paul,
thank you for pointing out this performance issue!

To see what is going on, print the types

print (type(np_float * blf.mat.AsVector()))

and

print (type(python_float * blf.mat.AsVector()))

With the Python float, NGSolve handles the float * vector product itself and the result is an NGSolve expression. With the numpy float, numpy takes over and does the computations.
Both should be reasonable choices; if you like TaskManager parallelization you may prefer the NGSolve expression.
However, there was a bottleneck: BaseVector did not implement the buffer protocol yet, so an (automatic) efficient conversion to numpy was not possible and the Python getitem was called for every index :frowning: . I just added the buffer protocol.
A quick fix is to use blf.mat.AsVector().FV(), which views the BaseVector as a FlatVector; the FlatVector already had the buffer protocol.
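
For example, one way to go through the FlatVector view is via its NumPy() array (this explicit route is just one option):

# FV().NumPy() exposes the data through the buffer protocol, so numpy
# operates on the raw data instead of calling getitem for every entry
vec = blf.mat.AsVector()
Mstar.AsVector().FV().NumPy()[:] = np_float * vec.FV().NumPy() + vec.FV().NumPy()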

best,
Joachim

Thanks for the quick answer and the fix. This explains why my code was so slow :slight_smile: