gen - add support for mixed precision operators #1853
base: main
Conversation
84d2396 to c08437d
Should we change the way flops are reported (say, cut them in half) for mixed precision operators?
I don't think so; a FLOP is a FLOP. There can be some other mechanism for adjusting that if the end-user wants to, but by default I don't think the count should be changed.
interface/ceed-operator.c (Outdated)
  @ref User
**/
int CeedOperatorSetMixedPrecision(CeedOperator op) {
This should take a bool if we want to go with this interface.
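A minimal sketch of that shape (not the actual libCEED API; the stand-in typedef and the extra argument are illustrative only):

```cuda
// Sketch of the reviewer's suggestion, not merged libCEED code.
typedef struct CeedOperator_private *CeedOperator;  // stand-in for the real opaque handle

// Suggested shape: an explicit flag rather than an enable-only setter
int CeedOperatorSetMixedPrecision(CeedOperator op, bool use_mixed_precision);
```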
I think a better interface would specify the precision that the operator will use internally via an enum of some sort. This will make it more future-proof in case we want to experiment with other precisions. Something like CEED_PRECISION_SINGLE, CEED_PRECISION_DOUBLE, etc. (if such an enum doesn't already exist).
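A sketch of what that enum-based interface might look like. The constant names follow the comment above; the type name CeedPrecisionType and the function name CeedOperatorSetPrecision are hypothetical:

```cuda
// Hypothetical interface sketch, not existing libCEED API.
typedef enum {
  CEED_PRECISION_SINGLE = 0,  // compute internally in single precision
  CEED_PRECISION_DOUBLE = 1,  // compute internally in double precision
} CeedPrecisionType;  // name is an assumption

typedef struct CeedOperator_private *CeedOperator;  // stand-in for the real opaque handle

// Replaces the bool setter above and leaves room for other precisions later
int CeedOperatorSetPrecision(CeedOperator op, CeedPrecisionType precision);
```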
Yeah, I agree. I'll think about the best way to support it; I can't imagine a use case for using a higher precision for the operator than for the entire solve, and it needs to be able to be a compile-time constant for the JiT to work.
Okay, I think changing that makes the interface much clearer. Also, I can think of a use case: numerically unstable implementations of QFunctions that people aren't interested in fixing.
Just updated, lmk what you think
@@ -1303,7 +1303,7 @@ int CeedSymmetricSchurDecomposition(Ceed ceed, CeedScalar *mat, CeedScalar *lamb
   // Reduce sub and super diagonal
   CeedInt p = 0, q = 0, itr = 0, max_itr = n * n * n * n;
-  CeedScalar tol = CEED_EPSILON;
+  CeedScalar tol = 10 * CEED_EPSILON;
Stray temp fix?
No, that's a real fix. It isn't able to get to machine precision now that CEED_EPSILON is machine precision.
probably worth putting into a separate PR if it applies to main?
It technically only applies here, since CEED_EPSILON is a bit bigger on main, but fair.
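For intuition on the tolerance change above, here is a toy model (plain host code, not libCEED's actual sweep; the decay factor and noise term are made up) of why a stopping criterion of exactly machine epsilon may never be met while 10 * CEED_EPSILON is:

```cuda
#include <cstdio>
#include <cfloat>

#define CEED_EPSILON DBL_EPSILON  // assumption: CEED_EPSILON is double machine epsilon here

int main() {
  // Off-diagonal "norm" of a matrix scaled to O(1); the update below is a stand-in
  // for one sweep of the reduction in CeedSymmetricSchurDecomposition.
  double       off_diag = 1.0;
  const double tol      = 10 * CEED_EPSILON;  // relaxed tolerance from the diff
  int          itr = 0;
  const int    max_itr = 1000;

  while (off_diag > tol && itr < max_itr) {
    // Geometric decay plus rounding noise at the scale of the diagonal: the noise
    // leaves a floor of a few epsilon, so in this model a tolerance of exactly
    // CEED_EPSILON is never reached and the loop would run to max_itr.
    off_diag = 0.25 * off_diag + 2.0 * CEED_EPSILON;
    itr++;
  }
  std::printf("stopped after %d sweeps, off_diag = %.1f * eps\n", itr, off_diag / CEED_EPSILON);
  return 0;
}
```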
4a7a755 to 767aa77
Adds support for mixed precision operators, set at creation time, for the CUDA gen backend. This is made possible by defining a second scalar type, CeedScalarCPU, which is always double precision, and changing the CeedScalar type during the JiT pass if the CEED_JIT_MIXED_PRECISION define is set.

The inputs to the device kernels are always CeedScalarCPU arrays, to avoid having to muck around with multiple pointers and such in a CeedVector. In gen, we only do things to the input and output arrays at the beginning and end of the kernel, so all of the computation happens with the single precision CeedScalar arrays we copy values into. This approach minimizes the code differences between mixed and full precision runs, essentially just requiring the helper functions to have extra template parameters to ensure the input types are correct.

The support for mixed precision operators is at the backend level, while the actual usage of mixed precision operations is defined per-operator to provide maximal flexibility.
This can be extended to the CUDA ref backend too, though the benefits will likely be more mild.
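A minimal sketch of the copy-in/copy-out idea, assuming the JiT pass supplies the CEED_JIT_MIXED_PRECISION define and the CeedScalarCPU typedef described above. The kernel and host driver are made-up stand-ins for illustration, not code from this PR:

```cuda
#include <cstdio>
#include <cuda_runtime.h>

typedef double CeedScalarCPU;   // always double precision, as in the description above
#ifdef CEED_JIT_MIXED_PRECISION
typedef float CeedScalar;       // internal compute precision when mixed precision is on
#else
typedef double CeedScalar;
#endif

// Toy stand-in for a generated gen kernel: the arguments stay CeedScalarCPU, and
// all intermediate work happens in CeedScalar after an explicit copy-in.
__global__ void ToyApply(const CeedScalarCPU *u, CeedScalarCPU *v, int n) {
  const int i = blockIdx.x * blockDim.x + threadIdx.x;
  if (i >= n) return;

  CeedScalar u_local = (CeedScalar)u[i];            // copy in (narrowing when mixed)
  CeedScalar v_local = u_local * (CeedScalar)2.0;   // stand-in for the QFunction/basis work
  v[i] = (CeedScalarCPU)v_local;                    // copy out (widening back to double)
}

int main() {
  const int      n = 8;
  CeedScalarCPU *u, *v;
  cudaMallocManaged(&u, n * sizeof(CeedScalarCPU));
  cudaMallocManaged(&v, n * sizeof(CeedScalarCPU));
  for (int i = 0; i < n; i++) u[i] = 1.0 / (i + 1);

  ToyApply<<<1, n>>>(u, v, n);
  cudaDeviceSynchronize();

  for (int i = 0; i < n; i++) std::printf("v[%d] = %.17g\n", i, v[i]);
  cudaFree(u);
  cudaFree(v);
  return 0;
}
```

Compiling with -DCEED_JIT_MIXED_PRECISION switches the intermediate arithmetic to single precision while the kernel arguments stay double, which mirrors the behavior described for the JiT'd gen kernels.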
@jeremylt and @nbeams, does this seem like a reasonable approach?