`Integrate(...)` dominates runtime in monolithic Navier–Stokes — is there a more efficient pattern?

Hi all, I’m profiling a monolithic incompressible Navier–Stokes time-stepper in NGSolve, and “coeff eval + Integrate” takes far more wall time than assembly or the linear solves.

Example timings (sum over steps):

  • dt=0.1: TOTAL 0.878 s, linear solves 0.139 s, coeff eval + Integrate 0.586 s
  • dt=0.05: TOTAL 1.588 s, linear solves 0.288 s, coeff eval + Integrate 1.116 s
  • dt=0.025: TOTAL 2.902 s, linear solves 0.556 s, coeff eval + Integrate 2.116 s
  • dt=0.0125: TOTAL 5.610 s, linear solves 1.084 s, coeff eval + Integrate 4.226 s
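To put those numbers in proportion, here is a quick sketch that turns the timings above into shares of the total. The variable names are just for illustration; the figures are copied from the list.

```python
# (dt, total, linear solves, coeff eval + Integrate), all in seconds,
# copied from the timings listed above.
timings = [
    (0.1,    0.878, 0.139, 0.586),
    (0.05,   1.588, 0.288, 1.116),
    (0.025,  2.902, 0.556, 2.116),
    (0.0125, 5.610, 1.084, 4.226),
]

integrate_share = []
solve_share = []
for dt, total, solves, integ in timings:
    integrate_share.append(integ / total)
    solve_share.append(solves / total)
    print(f"dt={dt}: Integrate {integ / total:.1%}, solves {solves / total:.1%}")
```

The Integrate share comes out at roughly two thirds to three quarters of the total and rises as dt shrinks, while the linear solves stay below about 20%.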

The hot section repeatedly integrates full-domain and boundary expressions, including gradients and nonlinear terms:

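# velocity gradient: du[i,j] = d(velocity[i]) / dx_j
du  = grad(velocity)
# convective term (velocity . grad) velocity, written out componentwise
Fun = CF((velocity[0]*du[0,0] + velocity[1]*du[0,1],
          velocity[0]*du[1,0] + velocity[1]*du[1,1]))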

a11 = Integrate(InnerProduct(velocity, velocity), mesh) \
    - Integrate(InnerProduct(velocity, velocity1), mesh)
a21 = - Integrate(InnerProduct(Fun, velocity1), mesh)
b2  = Integrate(InnerProduct(Fun, velocity3), mesh) \
    - Integrate(0.5*InnerProduct(velocity, velocity)*InnerProduct(velocity, n), mesh, BND)

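Assemble the bilinear forms once, outside the time loop, and replace the bilinear Integrate calls with matrix-vector products such as InnerProduct(velocity.vec, mat * velocity.vec). This works when the functions involved are GridFunctions on the space the matrix was assembled for.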