How to accelerate GridFunction.Set() with H-matrix-based CoefficientFunction

Hello everyone,

I’m working on integrating Radia (a magnetostatic field computation library) with NGSolve. Radia uses H-matrices (hierarchical matrices) for acceleration on large problems (N>1000 elements).

Problem

I’ve implemented a custom CoefficientFunction (RadiaField) that evaluates Radia’s magnetic field on NGSolve meshes:

import rad_ngsolve
# Get Radia field as CoefficientFunction
B_cf = rad_ngsolve.RadiaField(magnet, 'b')
# Set to GridFunction in HCurl space
fes = HCurl(mesh, order=1)
gf = GridFunction(fes)
gf.Set(B_cf) # ← This is slow

Current performance:

  • Mesh: 5034 vertices, 1260 elements
  • Execution time: 1500 ms
  • Radia’s batch evaluation (1000 points): 6.36x speedup possible

However, GridFunction.Set() evaluates 4 points at a time (per element), so we cannot leverage batch evaluation:

// NGSolve's internal behavior (pseudo-code)
for (element in mesh) {
BaseMappedIntegrationRule mir(4 points); // 4 points per element
cf.Evaluate(mir, result); // ← Called 1260 times
}

Issue:

  • Radia’s batch evaluation: 6.36x faster for 1000 points
  • But NGSolve only passes 4 points at a time
  • Result: Only 5% improvement (theoretically 6x is possible)

About H-matrices

Radia uses H-matrices for large problems:

  • Construction: O(N log N) complexity
  • Field evaluation (rad.Fld): Direct summation O(M × N)
  • Solver (rad.Solve): H-matrix acceleration O(N log N)

Important note: H-matrices are NOT used for field evaluation. This is the same in BEM libraries like ngbem.

Proposed Solution

Implement a custom Set() method that evaluates all vertices at once to maximize batch evaluation benefits:

# Proposed new API
gf.SetBatch(B_cf) # Evaluate all vertices at once

Expected benefits:

  • Evaluation calls: 1260 → 1
  • Batch size: 4 points → 5034 points (all vertices)
  • Expected speedup: 3-6x (when Radia’s batch evaluation is effective)

Implementation

I provide an implementation example in C++ code (SetBatch.cpp). The main idea:

  1. Collect all vertex coordinates
  2. Single batch call to evaluate CoefficientFunction
  3. Set results appropriately to GridFunction
    See the separate file (SetBatch.cpp) for implementation details.

SetBatch.cpp (14.3 KB)

Questions

  1. Is it feasible to modify GridFunction.Set() to use full-mesh batch evaluation, given NGSolve’s internal architecture?
  2. For HCurl spaces, how should we set values to edge DOFs rather than vertices?
  3. Are there existing NGSolve APIs for more efficient batch evaluation?
  4. Would such an optimization be valuable to include in NGSolve core?

Environment

  • NGSolve: 6.2.2506

Any advice is welcome. I’m particularly interested in hearing from those familiar with NGSolve’s internal structure.

maybe an idea: Use a IntegrationRuleFESpace (1 dof per integration point) to evaluate the h matrix and then either directly use this gf or interpolate this onto an arbitrary space.