@fw-immunant
Contributor

Currently this is just the C-side implementation; I'll add commits moving the impl of Rust splitting and merging from the crisp repository tomorrow.

@fw-immunant force-pushed the fw/c2rust-postprocess-split-merge branch from 16f6f85 to 4ffc28d (November 26, 2025 02:11)
@ahomescu
Contributor

Not against this approach per se, just wanted to bring up the alternative approach of using the existing c2rust::src_loc annotations for this. Do we not want to use those?

@thedataking force-pushed the feature/c2rust-postprocess-take-two branch 2 times, most recently from d2cf708 to 5b517cb (November 27, 2025 01:40)
@fw-immunant
Contributor Author

> Not against this approach per se, just wanted to bring up the alternative approach of using the existing c2rust::src_loc annotations for this. Do we not want to use those?

I discussed this with @thedataking in a call.

One problem is that we only have the c2rust::src_loc annotations if the transpiler was invoked with --reorganize-definitions, which may not always be the case. We could just say that we require this flag (or another that would also cause these attributes to be emitted) for comment reinsertion, but then it may be the case that these attributes themselves are surprising to the LLM and damage performance. Emitting this information out of band keeps this functionality independent and simplifies using it in other tools; our other tooling on the CRISP side currently handles a similar format of a flat JSON string->string map.
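
To make the "flat JSON string->string map" concrete, here is a minimal sketch of such a map being rendered. The thread says the transpiler serializes with serde_json; this std-only emitter is a swapped-in stand-in so the example has no dependencies. The choice of key (a definition name) and value (the C source text) is an assumption for illustration, not the PR's actual schema, and `escape_json` and `to_flat_json` are hypothetical names.

```rust
use std::collections::BTreeMap;
use std::fmt::Write;

/// Escape a string for use inside a JSON string literal.
fn escape_json(s: &str) -> String {
    let mut out = String::with_capacity(s.len() + 2);
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            c if (c as u32) < 0x20 => {
                // Remaining control characters must be \u-escaped.
                write!(out, "\\u{:04x}", c as u32).unwrap();
            }
            c => out.push(c),
        }
    }
    out
}

/// Render a flat string -> string map as a single JSON object.
/// Assumption: keys are definition names, values are C source snippets.
fn to_flat_json(map: &BTreeMap<String, String>) -> String {
    let body: Vec<String> = map
        .iter()
        .map(|(k, v)| format!("\"{}\":\"{}\"", escape_json(k), escape_json(v)))
        .collect();
    format!("{{{}}}", body.join(","))
}
```

Because the map is flat and both sides are plain strings, any tool that can parse a JSON object can consume it without knowing about Rust attributes.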

The other problem is that even given said annotations, we only know the start of the C definition, not its entire bounds (nor bounds of any relevant preceding comments). Most of the logic here is in getting precise bounds from the information in the Clang AST, which is otherwise lost. So we would need to add that logic somewhere even if we were going to use the c2rust::src_loc attribute for this.
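
As a rough illustration of what "bounds of any relevant preceding comments" means, the sketch below extends a definition's start offset backwards over comment lines that sit directly above it. The PR derives bounds from the Clang AST; this text-only heuristic is not that logic, only a simplified picture of the goal. The function name and the line-based comment test are assumptions.

```rust
/// Given the full C source and the byte offset where a definition starts,
/// walk backwards over directly preceding comment lines and return the
/// byte offset at which the (comments + definition) span should begin.
///
/// Heuristic only: a line counts as a comment if, after trimming, it
/// starts with `//`, `/*`, or `*`, or ends with `*/`. A blank line stops
/// the walk, so only comments attached to the definition are included.
/// Assumes `def_start` is a valid char boundary within `src`.
fn start_including_comments(src: &str, def_start: usize) -> usize {
    // Begin at the start of the line that contains the definition.
    let mut start = src[..def_start].rfind('\n').map_or(0, |i| i + 1);
    while start > 0 {
        // The previous line occupies prev_start..prev_end (newline excluded).
        let prev_end = start - 1;
        let prev_start = src[..prev_end].rfind('\n').map_or(0, |i| i + 1);
        let line = src[prev_start..prev_end].trim();
        let is_comment = line.starts_with("//")
            || line.starts_with("/*")
            || line.starts_with('*')
            || line.ends_with("*/");
        if !is_comment {
            break;
        }
        start = prev_start;
    }
    start
}
```

A text heuristic like this can misfire (for example on a line that begins with a dereference), which is one reason to take the bounds from the AST instead.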

@fw-immunant force-pushed the fw/c2rust-postprocess-split-merge branch from e751b54 to e51268a (December 1, 2025 14:46)
@fw-immunant
Contributor Author

We can't practically integrate the Rust splitting/merging tools into our Cargo.toml because they use Rust 2024 and have dependencies that also do. But I don't think it's a problem to just have them live in their own subdirectories for now; they just have to be built before the c2rust-postprocess tool can invoke them.

Contributor

@kkysen kkysen left a comment

> We can't practically integrate the Rust splitting/merging tools into our Cargo.toml because they use Rust 2024 and have dependencies that also do. But I don't think it's a problem to just have them live in their own subdirectories for now; they just have to be built before the c2rust-postprocess tool can invoke them.

That seems fine for now. When we get around to #1227, we can add it back.

@thedataking force-pushed the feature/c2rust-postprocess-take-two branch 7 times, most recently from 85b2bc4 to 4fe4d8d (December 4, 2025 02:43)
@fw-immunant force-pushed the fw/c2rust-postprocess-split-merge branch from e51268a to edcb0bf (December 4, 2025 19:28)
@fw-immunant changed the base branch from feature/c2rust-postprocess-take-two to master (December 4, 2025 19:29)
@fw-immunant changed the title from "[stacked] Add definition splitting to transpiler" to "Add definition splitting to transpiler" (Dec 4, 2025)
@fw-immunant
Contributor Author

Hm, I guess this somehow upsets the c2rust comments test; I will investigate.

@thedataking
Contributor

thedataking commented Dec 5, 2025

I pulled down this branch, built it, and ran it on lua (newest from git, not the one in tree). It panicked:

$ c2rust transpile ~/Work/lua/compile_commands.json --binary lua --emit-c-decl-map
warning: unknown warning option '-Wlogical-op'; did you mean '-Wlong-long'? [-Wunknown-warning-option]
warning: unknown warning option '-Wno-aggressive-loop-optimizations' [-Wunknown-warning-option]
warning: unknown warning option '-Wlogical-op'; did you mean '-Wlong-long'? [-Wunknown-warning-option]
warning: unknown warning option '-Wno-aggressive-loop-optimizations' [-Wunknown-warning-option]
2 warnings generated.
Transpiling lcode.c
failed to convert top-level decl CDeclId(15970)!
failed to convert top-level decl CDeclId(15996)!
failed to convert top-level decl CDeclId(17033)!
thread 'main' panicked at 'slice index starts at 50147 but ends at 36146', library/core/src/slice/index.rs:92:5
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

update: I tried the in-tree lua version too:

$ c2rust transpile tests/integration/tests/lua/compile_commands.json --binary lua --overwrite-existing --emit-c-decl-map
Transpiling lbaselib.c
failed to convert top-level decl CDeclId(3263)!
thread 'main' panicked at 'slice index starts at 1031 but ends at 903', library/core/src/slice/index.rs:92:5

also tried libxml2 in-tree:

❯ ./target/release/c2rust transpile tests/integration/tests/libxml2/compile_commands.json --emit-c-decl-map --binary runtest
Transpiling runtest.c
failed to convert top-level decl CDeclId(12370)!
failed to convert top-level decl CDeclId(12375)!
failed to convert top-level decl CDeclId(12385)!
failed to convert top-level decl CDeclId(12388)!
failed to convert top-level decl CDeclId(12392)!
failed to convert top-level decl CDeclId(12631)!
failed to convert top-level decl CDeclId(12663)!
failed to convert top-level decl CDeclId(14276)!
failed to convert top-level decl CDeclId(26708)!
failed to convert top-level decl CDeclId(26711)!
thread 'main' panicked at 'slice index starts at 43298 but ends at 1471', library/core/src/slice/index.rs:92:5
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

hacky workaround got me some .json files:

diff --git a/c2rust-transpile/src/c_ast/mod.rs b/c2rust-transpile/src/c_ast/mod.rs
index 2041ac32c..3d60bbe6d 100644
--- a/c2rust-transpile/src/c_ast/mod.rs
+++ b/c2rust-transpile/src/c_ast/mod.rs
@@ -310,6 +310,10 @@ impl TypedAstContext {
         for (decl_id, decl) in &self.c_decls {
             let begin_loc: SrcLoc = decl.begin_loc().expect("no begin loc for top-level decl");
             let end_loc: SrcLoc = decl.end_loc().expect("no end loc for top-level decl");
+            if begin_loc.line > end_loc.line {
+                eprintln!("Skipping invalid source range for decl {decl_id:?}: begin {begin_loc:?} > end {end_loc:?}");
+                continue;
+            }

             // If encountering a new file, reset end of last top-level decl.
             if prev_src_loc.fileid != begin_loc.fileid {
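
The panics above are Rust's standard message for slicing with a start index greater than the end index. The workaround in the diff compares only line numbers, so an inverted range on a single line would still slip through. Below is a minimal sketch of a stricter check plus a non-panicking slice. `Loc` is a hypothetical stand-in for the transpiler's `SrcLoc`; the `column` field is an assumption, since only `fileid` and `line` appear in the diff.

```rust
/// Hypothetical stand-in for the transpiler's source location type.
/// `column` is assumed; the diff above only shows `fileid` and `line`.
#[derive(Debug, Clone, Copy)]
struct Loc {
    fileid: u64,
    line: u64,
    column: u64,
}

/// A range is usable only if both ends are in the same file and the
/// begin position does not come after the end position.
fn is_valid_range(begin: Loc, end: Loc) -> bool {
    begin.fileid == end.fileid && (begin.line, begin.column) <= (end.line, end.column)
}

/// Slice by byte offsets without panicking. `str::get` returns `None`
/// for inverted ranges, out-of-bounds offsets, and offsets that do not
/// fall on a UTF-8 character boundary.
fn checked_slice(src: &str, begin: usize, end: usize) -> Option<&str> {
    src.get(begin..end)
}
```

With `checked_slice`, a bad range becomes a value the caller can log and skip rather than a crash of the whole transpile.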

@fw-immunant
Contributor Author

fw-immunant commented Dec 5, 2025

I just pushed a couple of commits that fix the comments test failure and the begin/end ordering issue; I had expected the initial traversal of decls to be sorted by source location, but it isn't. libxml2 seems to go through without issue now, but upon inspecting the JSON it looks like some definitions overlap others; debugging now.
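
For anyone wanting to check the emitted ranges for the overlap described here, a small sketch follows. It sorts ranges by file and start offset, then flags every range that starts before the furthest end seen so far in the same file. `DeclRange` and `find_overlaps` are hypothetical names, and byte offsets with an exclusive end are an assumption about how the ranges are represented.

```rust
/// Hypothetical byte range of one top-level definition within a file.
#[derive(Debug, Clone, PartialEq, Eq)]
struct DeclRange {
    name: String,
    fileid: u64,
    begin: usize,
    end: usize, // exclusive
}

/// Sort ranges by (file, begin, end), then report each range that starts
/// before the furthest end seen so far in the same file. Each reported
/// pair is (earlier widest range, overlapping range).
fn find_overlaps(ranges: &mut [DeclRange]) -> Vec<(String, String)> {
    ranges.sort_by_key(|r| (r.fileid, r.begin, r.end));
    let mut overlaps = Vec::new();
    // Index of the range with the furthest end in the current file.
    let mut widest: Option<usize> = None;
    for i in 0..ranges.len() {
        if let Some(w) = widest {
            if ranges[w].fileid == ranges[i].fileid {
                if ranges[i].begin < ranges[w].end {
                    overlaps.push((ranges[w].name.clone(), ranges[i].name.clone()));
                }
                if ranges[i].end <= ranges[w].end {
                    // Still inside the widest range; keep tracking it.
                    continue;
                }
            }
        }
        widest = Some(i);
    }
    overlaps
}
```

Tracking the furthest end, rather than comparing only neighbours, also catches a range that is nested inside a much earlier one.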

@thedataking
Contributor

I think tip-of-tree lua uses GNU labels as values which makes the c2rust transpiler barf and in turn, some of the json files emitted are not valid json. Please consider interactions between unsuccessful transpiles and this feature.

@thedataking
Contributor

thedataking commented Dec 6, 2025

> We can't practically integrate the Rust splitting/merging tools into our Cargo.toml because they use Rust 2024 and have dependencies that also do. But I don't think it's a problem to just have them live in their own subdirectories for now; they just have to be built before the c2rust-postprocess tool can invoke them.

I think you have to add tools to the exclude key in the Cargo.toml workspace then.

(It would also be good to have a README in the tools dir showing how to build, run, and test the tools.)

@fw-immunant
Contributor Author

fw-immunant commented Dec 6, 2025

> I think tip-of-tree lua uses GNU labels as values which makes the c2rust transpiler barf and in turn, some of the json files emitted are not valid json. Please consider interactions between unsuccessful transpiles and this feature.

I wasn't able to reproduce this--which JSON output is malformed, and how? It's surprising that we would see invalid JSON, because we serialize it with serde_json, which makes me think that incorrect results based on the transpiler giving up should still be serialized properly, unless we somehow fail to flush a file buffer or something.
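
To illustrate the "fail to flush a file buffer" hypothesis: a `BufWriter` flushes when dropped but discards any error at that point, and destructors do not run at all if the process aborts or calls `std::process::exit`, so a file can end up truncated. The sketch below flushes explicitly so failures surface. Whether the transpiler writes its JSON this way is not established in this thread; `write_json_file` is a hypothetical helper.

```rust
use std::fs::File;
use std::io::{self, BufWriter, Write};
use std::path::Path;

/// Write `json` to `path`, surfacing any I/O error instead of losing it.
fn write_json_file(path: &Path, json: &str) -> io::Result<()> {
    let mut writer = BufWriter::new(File::create(path)?);
    writer.write_all(json.as_bytes())?;
    // BufWriter flushes on drop but ignores errors there, so flush here.
    writer.flush()?;
    Ok(())
}
```

If the output were produced with serde_json, the same explicit flush would apply to whatever writer is handed to it.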

Any tips for reproducing the failure? I tried:

git clone https://github.com/lua/lua
cd lua
bear -- make -j9
c2rust-transpile --overwrite-existing --emit-build-files --emit-c-decl-map compile_commands.json

From the transpiler, I got this output:

c2rust-transpile output
warning: unknown warning option '-Wlogical-op'; did you mean '-Wlong-long'? [-Wunknown-warning-option]
warning: unknown warning option '-Wno-aggressive-loop-optimizations' [-Wunknown-warning-option]
warning: unknown warning option '-Wlogical-op'; did you mean '-Wlong-long'? [-Wunknown-warning-option]
warning: unknown warning option '-Wno-aggressive-loop-optimizations' [-Wunknown-warning-option]
2 warnings generated.
Transpiling lfunc.c
warning: unknown warning option '-Wlogical-op'; did you mean '-Wlong-long'? [-Wunknown-warning-option]
warning: unknown warning option '-Wno-aggressive-loop-optimizations' [-Wunknown-warning-option]
warning: unknown warning option '-Wlogical-op'; did you mean '-Wlong-long'? [-Wunknown-warning-option]
warning: unknown warning option '-Wno-aggressive-loop-optimizations' [-Wunknown-warning-option]
2 warnings generated.
Transpiling lapi.c
failed to convert top-level decl CDeclId(3150)!
failed to convert top-level decl CDeclId(5838)!
failed to convert top-level decl CDeclId(5860)!
failed to convert top-level decl CDeclId(11744)!
warning: unknown warning option '-Wlogical-op'; did you mean '-Wlong-long'? [-Wunknown-warning-option]
warning: unknown warning option '-Wno-aggressive-loop-optimizations' [-Wunknown-warning-option]
warning: unknown warning option '-Wlogical-op'; did you mean '-Wlong-long'? [-Wunknown-warning-option]
warning: unknown warning option '-Wno-aggressive-loop-optimizations' [-Wunknown-warning-option]
2 warnings generated.
Transpiling lcode.c
failed to convert top-level decl CDeclId(16982)!
failed to convert top-level decl CDeclId(15919)!
failed to convert top-level decl CDeclId(15945)!
warning: unknown warning option '-Wlogical-op'; did you mean '-Wlong-long'? [-Wunknown-warning-option]
warning: unknown warning option '-Wno-aggressive-loop-optimizations' [-Wunknown-warning-option]
warning: unknown warning option '-Wlogical-op'; did you mean '-Wlong-long'? [-Wunknown-warning-option]
warning: unknown warning option '-Wno-aggressive-loop-optimizations' [-Wunknown-warning-option]
2 warnings generated.
Transpiling ldump.c
failed to convert top-level decl CDeclId(2433)!
failed to convert top-level decl CDeclId(2436)!
failed to convert top-level decl CDeclId(4326)!
warning: unknown warning option '-Wlogical-op'; did you mean '-Wlong-long'? [-Wunknown-warning-option]
warning: unknown warning option '-Wno-aggressive-loop-optimizations' [-Wunknown-warning-option]
warning: unknown warning option '-Wlogical-op'; did you mean '-Wlong-long'? [-Wunknown-warning-option]
warning: unknown warning option '-Wno-aggressive-loop-optimizations' [-Wunknown-warning-option]
2 warnings generated.
Transpiling lctype.c
warning: unknown warning option '-Wlogical-op'; did you mean '-Wlong-long'? [-Wunknown-warning-option]
warning: unknown warning option '-Wno-aggressive-loop-optimizations' [-Wunknown-warning-option]
warning: unknown warning option '-Wlogical-op'; did you mean '-Wlong-long'? [-Wunknown-warning-option]
warning: unknown warning option '-Wno-aggressive-loop-optimizations' [-Wunknown-warning-option]
2 warnings generated.
Transpiling ldo.c
failed to convert top-level decl CDeclId(6037)!
failed to convert top-level decl CDeclId(1316)!
failed to convert top-level decl CDeclId(11183)!
failed to convert top-level decl CDeclId(2276)!
failed to convert top-level decl CDeclId(9540)!
failed to convert top-level decl CDeclId(8985)!
warning: unknown warning option '-Wlogical-op'; did you mean '-Wlong-long'? [-Wunknown-warning-option]
warning: unknown warning option '-Wno-aggressive-loop-optimizations' [-Wunknown-warning-option]
warning: unknown warning option '-Wlogical-op'; did you mean '-Wlong-long'? [-Wunknown-warning-option]
warning: unknown warning option '-Wno-aggressive-loop-optimizations' [-Wunknown-warning-option]
2 warnings generated.
Transpiling llex.c
failed to convert top-level decl CDeclId(7903)!
warning: unknown warning option '-Wlogical-op'; did you mean '-Wlong-long'? [-Wunknown-warning-option]
warning: unknown warning option '-Wno-aggressive-loop-optimizations' [-Wunknown-warning-option]
warning: unknown warning option '-Wlogical-op'; did you mean '-Wlong-long'? [-Wunknown-warning-option]
warning: unknown warning option '-Wno-aggressive-loop-optimizations' [-Wunknown-warning-option]
2 warnings generated.
Transpiling ldebug.c
failed to convert top-level decl CDeclId(10309)!
failed to convert top-level decl CDeclId(8801)!
failed to convert top-level decl CDeclId(8825)!
warning: unknown warning option '-Wlogical-op'; did you mean '-Wlong-long'? [-Wunknown-warning-option]
warning: unknown warning option '-Wno-aggressive-loop-optimizations' [-Wunknown-warning-option]
warning: unknown warning option '-Wlogical-op'; did you mean '-Wlong-long'? [-Wunknown-warning-option]
warning: unknown warning option '-Wno-aggressive-loop-optimizations' [-Wunknown-warning-option]
2 warnings generated.
Transpiling lgc.c
failed to convert top-level decl CDeclId(14354)!
failed to convert top-level decl CDeclId(14359)!
failed to convert top-level decl CDeclId(14357)!
failed to convert top-level decl CDeclId(3525)!
warning: unknown warning option '-Wlogical-op'; did you mean '-Wlong-long'? [-Wunknown-warning-option]
warning: unknown warning option '-Wno-aggressive-loop-optimizations' [-Wunknown-warning-option]
warning: unknown warning option '-Wlogical-op'; did you mean '-Wlong-long'? [-Wunknown-warning-option]
warning: unknown warning option '-Wno-aggressive-loop-optimizations' [-Wunknown-warning-option]
2 warnings generated.
Transpiling lmem.c
failed to convert top-level decl CDeclId(1714)!
warning: unknown warning option '-Wlogical-op'; did you mean '-Wlong-long'? [-Wunknown-warning-option]
warning: unknown warning option '-Wno-aggressive-loop-optimizations' [-Wunknown-warning-option]
warning: unknown warning option '-Wlogical-op'; did you mean '-Wlong-long'? [-Wunknown-warning-option]
warning: unknown warning option '-Wno-aggressive-loop-optimizations' [-Wunknown-warning-option]
2 warnings generated.
Transpiling lobject.c
failed to convert top-level decl CDeclId(7397)!
failed to convert top-level decl CDeclId(7400)!
failed to convert top-level decl CDeclId(8667)!
failed to convert top-level decl CDeclId(8644)!
warning: unknown warning option '-Wlogical-op'; did you mean '-Wlong-long'? [-Wunknown-warning-option]
warning: unknown warning option '-Wno-aggressive-loop-optimizations' [-Wunknown-warning-option]
warning: unknown warning option '-Wlogical-op'; did you mean '-Wlong-long'? [-Wunknown-warning-option]
warning: unknown warning option '-Wno-aggressive-loop-optimizations' [-Wunknown-warning-option]
2 warnings generated.
Transpiling lopcodes.c
warning: unknown warning option '-Wlogical-op'; did you mean '-Wlong-long'? [-Wunknown-warning-option]
warning: unknown warning option '-Wno-aggressive-loop-optimizations' [-Wunknown-warning-option]
warning: unknown warning option '-Wlogical-op'; did you mean '-Wlong-long'? [-Wunknown-warning-option]
warning: unknown warning option '-Wno-aggressive-loop-optimizations' [-Wunknown-warning-option]
2 warnings generated.
Transpiling lparser.c
failed to convert top-level decl CDeclId(2639)!
failed to convert top-level decl CDeclId(1761)!
failed to convert top-level decl CDeclId(15285)!
failed to convert top-level decl CDeclId(15289)!
failed to convert top-level decl CDeclId(15288)!
failed to convert top-level decl CDeclId(2199)!
failed to convert top-level decl CDeclId(5290)!
failed to convert top-level decl CDeclId(5293)!
failed to convert top-level decl CDeclId(4631)!
failed to convert top-level decl CDeclId(4097)!
warning: unknown warning option '-Wlogical-op'; did you mean '-Wlong-long'? [-Wunknown-warning-option]
warning: unknown warning option '-Wno-aggressive-loop-optimizations' [-Wunknown-warning-option]
warning: unknown warning option '-Wlogical-op'; did you mean '-Wlong-long'? [-Wunknown-warning-option]
warning: unknown warning option '-Wno-aggressive-loop-optimizations' [-Wunknown-warning-option]
2 warnings generated.
Transpiling lstate.c
failed to convert top-level decl CDeclId(5057)!
warning: unknown warning option '-Wlogical-op'; did you mean '-Wlong-long'? [-Wunknown-warning-option]
warning: unknown warning option '-Wno-aggressive-loop-optimizations' [-Wunknown-warning-option]
warning: unknown warning option '-Wlogical-op'; did you mean '-Wlong-long'? [-Wunknown-warning-option]
warning: unknown warning option '-Wno-aggressive-loop-optimizations' [-Wunknown-warning-option]
2 warnings generated.
Transpiling lstring.c
failed to convert top-level decl CDeclId(3140)!
failed to convert top-level decl CDeclId(4696)!
warning: unknown warning option '-Wlogical-op'; did you mean '-Wlong-long'? [-Wunknown-warning-option]
warning: unknown warning option '-Wno-aggressive-loop-optimizations' [-Wunknown-warning-option]
warning: unknown warning option '-Wlogical-op'; did you mean '-Wlong-long'? [-Wunknown-warning-option]
warning: unknown warning option '-Wno-aggressive-loop-optimizations' [-Wunknown-warning-option]
2 warnings generated.
Transpiling ltable.c
failed to convert top-level decl CDeclId(12216)!
failed to convert top-level decl CDeclId(12213)!
failed to convert top-level decl CDeclId(6579)!
failed to convert top-level decl CDeclId(6582)!
failed to convert top-level decl CDeclId(8481)!
failed to convert top-level decl CDeclId(12225)!
failed to convert top-level decl CDeclId(12226)!
failed to convert top-level decl CDeclId(9402)!
failed to convert top-level decl CDeclId(9405)!
warning: unknown warning option '-Wlogical-op'; did you mean '-Wlong-long'? [-Wunknown-warning-option]
warning: unknown warning option '-Wno-aggressive-loop-optimizations' [-Wunknown-warning-option]
warning: unknown warning option '-Wlogical-op'; did you mean '-Wlong-long'? [-Wunknown-warning-option]
warning: unknown warning option '-Wno-aggressive-loop-optimizations' [-Wunknown-warning-option]
2 warnings generated.
Transpiling ltm.c
failed to convert top-level decl CDeclId(3212)!
warning: unknown warning option '-Wlogical-op'; did you mean '-Wlong-long'? [-Wunknown-warning-option]
warning: unknown warning option '-Wno-aggressive-loop-optimizations' [-Wunknown-warning-option]
warning: unknown warning option '-Wlogical-op'; did you mean '-Wlong-long'? [-Wunknown-warning-option]
warning: unknown warning option '-Wno-aggressive-loop-optimizations' [-Wunknown-warning-option]
In file included from lvm.c:1205:
./ljumptab.h:28:1: warning: c2rust: Cannot translate GNU address of label expression
   28 | &&L_OP_MOVE,
      | ^~
./ljumptab.h:29:1: warning: c2rust: Cannot translate GNU address of label expression
   29 | &&L_OP_LOADI,
      | ^~
./ljumptab.h:30:1: warning: c2rust: Cannot translate GNU address of label expression
   30 | &&L_OP_LOADF,
      | ^~
./ljumptab.h:31:1: warning: c2rust: Cannot translate GNU address of label expression
   31 | &&L_OP_LOADK,
      | ^~
./ljumptab.h:32:1: warning: c2rust: Cannot translate GNU address of label expression
   32 | &&L_OP_LOADKX,
      | ^~
./ljumptab.h:33:1: warning: c2rust: Cannot translate GNU address of label expression
   33 | &&L_OP_LOADFALSE,
      | ^~
./ljumptab.h:34:1: warning: c2rust: Cannot translate GNU address of label expression
   34 | &&L_OP_LFALSESKIP,
      | ^~
./ljumptab.h:35:1: warning: c2rust: Cannot translate GNU address of label expression
   35 | &&L_OP_LOADTRUE,
      | ^~
./ljumptab.h:36:1: warning: c2rust: Cannot translate GNU address of label expression
   36 | &&L_OP_LOADNIL,
      | ^~
./ljumptab.h:37:1: warning: c2rust: Cannot translate GNU address of label expression
   37 | &&L_OP_GETUPVAL,
      | ^~
./ljumptab.h:38:1: warning: c2rust: Cannot translate GNU address of label expression
   38 | &&L_OP_SETUPVAL,
      | ^~
./ljumptab.h:39:1: warning: c2rust: Cannot translate GNU address of label expression
   39 | &&L_OP_GETTABUP,
      | ^~
./ljumptab.h:40:1: warning: c2rust: Cannot translate GNU address of label expression
   40 | &&L_OP_GETTABLE,
      | ^~
./ljumptab.h:41:1: warning: c2rust: Cannot translate GNU address of label expression
   41 | &&L_OP_GETI,
      | ^~
./ljumptab.h:42:1: warning: c2rust: Cannot translate GNU address of label expression
   42 | &&L_OP_GETFIELD,
      | ^~
./ljumptab.h:43:1: warning: c2rust: Cannot translate GNU address of label expression
   43 | &&L_OP_SETTABUP,
      | ^~
./ljumptab.h:44:1: warning: c2rust: Cannot translate GNU address of label expression
   44 | &&L_OP_SETTABLE,
      | ^~
./ljumptab.h:45:1: warning: c2rust: Cannot translate GNU address of label expression
   45 | &&L_OP_SETI,
      | ^~
./ljumptab.h:46:1: warning: c2rust: Cannot translate GNU address of label expression
   46 | &&L_OP_SETFIELD,
      | ^~
./ljumptab.h:47:1: warning: c2rust: Cannot translate GNU address of label expression
   47 | &&L_OP_NEWTABLE,
      | ^~
./ljumptab.h:48:1: warning: c2rust: Cannot translate GNU address of label expression
   48 | &&L_OP_SELF,
      | ^~
./ljumptab.h:49:1: warning: c2rust: Cannot translate GNU address of label expression
   49 | &&L_OP_ADDI,
      | ^~
./ljumptab.h:50:1: warning: c2rust: Cannot translate GNU address of label expression
   50 | &&L_OP_ADDK,
      | ^~
./ljumptab.h:51:1: warning: c2rust: Cannot translate GNU address of label expression
   51 | &&L_OP_SUBK,
      | ^~
./ljumptab.h:52:1: warning: c2rust: Cannot translate GNU address of label expression
   52 | &&L_OP_MULK,
      | ^~
./ljumptab.h:53:1: warning: c2rust: Cannot translate GNU address of label expression
   53 | &&L_OP_MODK,
      | ^~
./ljumptab.h:54:1: warning: c2rust: Cannot translate GNU address of label expression
   54 | &&L_OP_POWK,
      | ^~
./ljumptab.h:55:1: warning: c2rust: Cannot translate GNU address of label expression
   55 | &&L_OP_DIVK,
      | ^~
./ljumptab.h:56:1: warning: c2rust: Cannot translate GNU address of label expression
   56 | &&L_OP_IDIVK,
      | ^~
./ljumptab.h:57:1: warning: c2rust: Cannot translate GNU address of label expression
   57 | &&L_OP_BANDK,
      | ^~
./ljumptab.h:58:1: warning: c2rust: Cannot translate GNU address of label expression
   58 | &&L_OP_BORK,
      | ^~
./ljumptab.h:59:1: warning: c2rust: Cannot translate GNU address of label expression
   59 | &&L_OP_BXORK,
      | ^~
./ljumptab.h:60:1: warning: c2rust: Cannot translate GNU address of label expression
   60 | &&L_OP_SHLI,
      | ^~
./ljumptab.h:61:1: warning: c2rust: Cannot translate GNU address of label expression
   61 | &&L_OP_SHRI,
      | ^~
./ljumptab.h:62:1: warning: c2rust: Cannot translate GNU address of label expression
   62 | &&L_OP_ADD,
      | ^~
./ljumptab.h:63:1: warning: c2rust: Cannot translate GNU address of label expression
   63 | &&L_OP_SUB,
      | ^~
./ljumptab.h:64:1: warning: c2rust: Cannot translate GNU address of label expression
   64 | &&L_OP_MUL,
      | ^~
./ljumptab.h:65:1: warning: c2rust: Cannot translate GNU address of label expression
   65 | &&L_OP_MOD,
      | ^~
./ljumptab.h:66:1: warning: c2rust: Cannot translate GNU address of label expression
   66 | &&L_OP_POW,
      | ^~
./ljumptab.h:67:1: warning: c2rust: Cannot translate GNU address of label expression
   67 | &&L_OP_DIV,
      | ^~
./ljumptab.h:68:1: warning: c2rust: Cannot translate GNU address of label expression
   68 | &&L_OP_IDIV,
      | ^~
./ljumptab.h:69:1: warning: c2rust: Cannot translate GNU address of label expression
   69 | &&L_OP_BAND,
      | ^~
./ljumptab.h:70:1: warning: c2rust: Cannot translate GNU address of label expression
   70 | &&L_OP_BOR,
      | ^~
./ljumptab.h:71:1: warning: c2rust: Cannot translate GNU address of label expression
   71 | &&L_OP_BXOR,
      | ^~
./ljumptab.h:72:1: warning: c2rust: Cannot translate GNU address of label expression
   72 | &&L_OP_SHL,
      | ^~
./ljumptab.h:73:1: warning: c2rust: Cannot translate GNU address of label expression
   73 | &&L_OP_SHR,
      | ^~
./ljumptab.h:74:1: warning: c2rust: Cannot translate GNU address of label expression
   74 | &&L_OP_MMBIN,
      | ^~
./ljumptab.h:75:1: warning: c2rust: Cannot translate GNU address of label expression
   75 | &&L_OP_MMBINI,
      | ^~
./ljumptab.h:76:1: warning: c2rust: Cannot translate GNU address of label expression
   76 | &&L_OP_MMBINK,
      | ^~
./ljumptab.h:77:1: warning: c2rust: Cannot translate GNU address of label expression
   77 | &&L_OP_UNM,
      | ^~
./ljumptab.h:78:1: warning: c2rust: Cannot translate GNU address of label expression
   78 | &&L_OP_BNOT,
      | ^~
./ljumptab.h:79:1: warning: c2rust: Cannot translate GNU address of label expression
   79 | &&L_OP_NOT,
      | ^~
./ljumptab.h:80:1: warning: c2rust: Cannot translate GNU address of label expression
   80 | &&L_OP_LEN,
      | ^~
./ljumptab.h:81:1: warning: c2rust: Cannot translate GNU address of label expression
   81 | &&L_OP_CONCAT,
      | ^~
./ljumptab.h:82:1: warning: c2rust: Cannot translate GNU address of label expression
   82 | &&L_OP_CLOSE,
      | ^~
./ljumptab.h:83:1: warning: c2rust: Cannot translate GNU address of label expression
   83 | &&L_OP_TBC,
      | ^~
./ljumptab.h:84:1: warning: c2rust: Cannot translate GNU address of label expression
   84 | &&L_OP_JMP,
      | ^~
./ljumptab.h:85:1: warning: c2rust: Cannot translate GNU address of label expression
   85 | &&L_OP_EQ,
      | ^~
./ljumptab.h:86:1: warning: c2rust: Cannot translate GNU address of label expression
   86 | &&L_OP_LT,
      | ^~
./ljumptab.h:87:1: warning: c2rust: Cannot translate GNU address of label expression
   87 | &&L_OP_LE,
      | ^~
./ljumptab.h:88:1: warning: c2rust: Cannot translate GNU address of label expression
   88 | &&L_OP_EQK,
      | ^~
./ljumptab.h:89:1: warning: c2rust: Cannot translate GNU address of label expression
   89 | &&L_OP_EQI,
      | ^~
./ljumptab.h:90:1: warning: c2rust: Cannot translate GNU address of label expression
   90 | &&L_OP_LTI,
      | ^~
./ljumptab.h:91:1: warning: c2rust: Cannot translate GNU address of label expression
   91 | &&L_OP_LEI,
      | ^~
./ljumptab.h:92:1: warning: c2rust: Cannot translate GNU address of label expression
   92 | &&L_OP_GTI,
      | ^~
./ljumptab.h:93:1: warning: c2rust: Cannot translate GNU address of label expression
   93 | &&L_OP_GEI,
      | ^~
./ljumptab.h:94:1: warning: c2rust: Cannot translate GNU address of label expression
   94 | &&L_OP_TEST,
      | ^~
./ljumptab.h:95:1: warning: c2rust: Cannot translate GNU address of label expression
   95 | &&L_OP_TESTSET,
      | ^~
./ljumptab.h:96:1: warning: c2rust: Cannot translate GNU address of label expression
   96 | &&L_OP_CALL,
      | ^~
./ljumptab.h:97:1: warning: c2rust: Cannot translate GNU address of label expression
   97 | &&L_OP_TAILCALL,
      | ^~
./ljumptab.h:98:1: warning: c2rust: Cannot translate GNU address of label expression
   98 | &&L_OP_RETURN,
      | ^~
./ljumptab.h:99:1: warning: c2rust: Cannot translate GNU address of label expression
   99 | &&L_OP_RETURN0,
      | ^~
./ljumptab.h:100:1: warning: c2rust: Cannot translate GNU address of label expression
  100 | &&L_OP_RETURN1,
      | ^~
./ljumptab.h:101:1: warning: c2rust: Cannot translate GNU address of label expression
  101 | &&L_OP_FORLOOP,
      | ^~
./ljumptab.h:102:1: warning: c2rust: Cannot translate GNU address of label expression
  102 | &&L_OP_FORPREP,
      | ^~
./ljumptab.h:103:1: warning: c2rust: Cannot translate GNU address of label expression
  103 | &&L_OP_TFORPREP,
      | ^~
./ljumptab.h:104:1: warning: c2rust: Cannot translate GNU address of label expression
  104 | &&L_OP_TFORCALL,
      | ^~
./ljumptab.h:105:1: warning: c2rust: Cannot translate GNU address of label expression
  105 | &&L_OP_TFORLOOP,
      | ^~
./ljumptab.h:106:1: warning: c2rust: Cannot translate GNU address of label expression
  106 | &&L_OP_SETLIST,
      | ^~
./ljumptab.h:107:1: warning: c2rust: Cannot translate GNU address of label expression
  107 | &&L_OP_CLOSURE,
      | ^~
./ljumptab.h:108:1: warning: c2rust: Cannot translate GNU address of label expression
  108 | &&L_OP_VARARG,
      | ^~
./ljumptab.h:109:1: warning: c2rust: Cannot translate GNU address of label expression
  109 | &&L_OP_GETVARG,
      | ^~
./ljumptab.h:110:1: warning: c2rust: Cannot translate GNU address of label expression
  110 | &&L_OP_ERRNNIL,
      | ^~
./ljumptab.h:111:1: warning: c2rust: Cannot translate GNU address of label expression
  111 | &&L_OP_VARARGPREP,
      | ^~
./ljumptab.h:112:1: warning: c2rust: Cannot translate GNU address of label expression
  112 | &&L_OP_EXTRAARG
      | ^~
./ljumptab.h:28:1: warning: c2rust: Cannot translate GNU address of label expression
   28 | &&L_OP_MOVE,
      | ^~
./ljumptab.h:29:1: warning: c2rust: Cannot translate GNU address of label expression
   29 | &&L_OP_LOADI,
      | ^~
./ljumptab.h:30:1: warning: c2rust: Cannot translate GNU address of label expression
   30 | &&L_OP_LOADF,
      | ^~
./ljumptab.h:31:1: warning: c2rust: Cannot translate GNU address of label expression
   31 | &&L_OP_LOADK,
      | ^~
./ljumptab.h:32:1: warning: c2rust: Cannot translate GNU address of label expression
   32 | &&L_OP_LOADKX,
      | ^~
./ljumptab.h:33:1: warning: c2rust: Cannot translate GNU address of label expression
   33 | &&L_OP_LOADFALSE,
      | ^~
./ljumptab.h:34:1: warning: c2rust: Cannot translate GNU address of label expression
   34 | &&L_OP_LFALSESKIP,
      | ^~
./ljumptab.h:35:1: warning: c2rust: Cannot translate GNU address of label expression
   35 | &&L_OP_LOADTRUE,
      | ^~
./ljumptab.h:36:1: warning: c2rust: Cannot translate GNU address of label expression
   36 | &&L_OP_LOADNIL,
      | ^~
./ljumptab.h:37:1: warning: c2rust: Cannot translate GNU address of label expression
   37 | &&L_OP_GETUPVAL,
      | ^~
./ljumptab.h:38:1: warning: c2rust: Cannot translate GNU address of label expression
   38 | &&L_OP_SETUPVAL,
      | ^~
./ljumptab.h:39:1: warning: c2rust: Cannot translate GNU address of label expression
   39 | &&L_OP_GETTABUP,
      | ^~
./ljumptab.h:40:1: warning: c2rust: Cannot translate GNU address of label expression
   40 | &&L_OP_GETTABLE,
      | ^~
./ljumptab.h:41:1: warning: c2rust: Cannot translate GNU address of label expression
   41 | &&L_OP_GETI,
      | ^~
./ljumptab.h:42:1: warning: c2rust: Cannot translate GNU address of label expression
   42 | &&L_OP_GETFIELD,
      | ^~
./ljumptab.h:43:1: warning: c2rust: Cannot translate GNU address of label expression
   43 | &&L_OP_SETTABUP,
      | ^~
./ljumptab.h:44:1: warning: c2rust: Cannot translate GNU address of label expression
   44 | &&L_OP_SETTABLE,
      | ^~
./ljumptab.h:45:1: warning: c2rust: Cannot translate GNU address of label expression
   45 | &&L_OP_SETI,
      | ^~
./ljumptab.h:46:1: warning: c2rust: Cannot translate GNU address of label expression
   46 | &&L_OP_SETFIELD,
      | ^~
./ljumptab.h:47:1: warning: c2rust: Cannot translate GNU address of label expression
   47 | &&L_OP_NEWTABLE,
      | ^~
./ljumptab.h:48:1: warning: c2rust: Cannot translate GNU address of label expression
   48 | &&L_OP_SELF,
      | ^~
./ljumptab.h:49:1: warning: c2rust: Cannot translate GNU address of label expression
   49 | &&L_OP_ADDI,
      | ^~
./ljumptab.h:50:1: warning: c2rust: Cannot translate GNU address of label expression
   50 | &&L_OP_ADDK,
      | ^~
./ljumptab.h:51:1: warning: c2rust: Cannot translate GNU address of label expression
   51 | &&L_OP_SUBK,
      | ^~
./ljumptab.h:52:1: warning: c2rust: Cannot translate GNU address of label expression
   52 | &&L_OP_MULK,
      | ^~
./ljumptab.h:53:1: warning: c2rust: Cannot translate GNU address of label expression
   53 | &&L_OP_MODK,
      | ^~
./ljumptab.h:54:1: warning: c2rust: Cannot translate GNU address of label expression
   54 | &&L_OP_POWK,
      | ^~
./ljumptab.h:55:1: warning: c2rust: Cannot translate GNU address of label expression
   55 | &&L_OP_DIVK,
      | ^~
./ljumptab.h:56:1: warning: c2rust: Cannot translate GNU address of label expression
   56 | &&L_OP_IDIVK,
      | ^~
./ljumptab.h:57:1: warning: c2rust: Cannot translate GNU address of label expression
   57 | &&L_OP_BANDK,
      | ^~
./ljumptab.h:58:1: warning: c2rust: Cannot translate GNU address of label expression
   58 | &&L_OP_BORK,
      | ^~
./ljumptab.h:59:1: warning: c2rust: Cannot translate GNU address of label expression
   59 | &&L_OP_BXORK,
      | ^~
./ljumptab.h:60:1: warning: c2rust: Cannot translate GNU address of label expression
   60 | &&L_OP_SHLI,
      | ^~
./ljumptab.h:61:1: warning: c2rust: Cannot translate GNU address of label expression
   61 | &&L_OP_SHRI,
      | ^~
./ljumptab.h:62:1: warning: c2rust: Cannot translate GNU address of label expression
   62 | &&L_OP_ADD,
      | ^~
./ljumptab.h:63:1: warning: c2rust: Cannot translate GNU address of label expression
   63 | &&L_OP_SUB,
      | ^~
./ljumptab.h:64:1: warning: c2rust: Cannot translate GNU address of label expression
   64 | &&L_OP_MUL,
      | ^~
./ljumptab.h:65:1: warning: c2rust: Cannot translate GNU address of label expression
   65 | &&L_OP_MOD,
      | ^~
./ljumptab.h:66:1: warning: c2rust: Cannot translate GNU address of label expression
   66 | &&L_OP_POW,
      | ^~
./ljumptab.h:67:1: warning: c2rust: Cannot translate GNU address of label expression
   67 | &&L_OP_DIV,
      | ^~
./ljumptab.h:68:1: warning: c2rust: Cannot translate GNU address of label expression
   68 | &&L_OP_IDIV,
      | ^~
./ljumptab.h:69:1: warning: c2rust: Cannot translate GNU address of label expression
   69 | &&L_OP_BAND,
      | ^~
./ljumptab.h:70:1: warning: c2rust: Cannot translate GNU address of label expression
   70 | &&L_OP_BOR,
      | ^~
./ljumptab.h:71:1: warning: c2rust: Cannot translate GNU address of label expression
   71 | &&L_OP_BXOR,
      | ^~
./ljumptab.h:72:1: warning: c2rust: Cannot translate GNU address of label expression
   72 | &&L_OP_SHL,
      | ^~
./ljumptab.h:73:1: warning: c2rust: Cannot translate GNU address of label expression
   73 | &&L_OP_SHR,
      | ^~
./ljumptab.h:74:1: warning: c2rust: Cannot translate GNU address of label expression
   74 | &&L_OP_MMBIN,
      | ^~
./ljumptab.h:75:1: warning: c2rust: Cannot translate GNU address of label expression
   75 | &&L_OP_MMBINI,
      | ^~
./ljumptab.h:76:1: warning: c2rust: Cannot translate GNU address of label expression
   76 | &&L_OP_MMBINK,
      | ^~
./ljumptab.h:77:1: warning: c2rust: Cannot translate GNU address of label expression
   77 | &&L_OP_UNM,
      | ^~
./ljumptab.h:78:1: warning: c2rust: Cannot translate GNU address of label expression
   78 | &&L_OP_BNOT,
      | ^~
./ljumptab.h:79:1: warning: c2rust: Cannot translate GNU address of label expression
   79 | &&L_OP_NOT,
      | ^~
./ljumptab.h:80:1: warning: c2rust: Cannot translate GNU address of label expression
   80 | &&L_OP_LEN,
      | ^~
./ljumptab.h:81:1: warning: c2rust: Cannot translate GNU address of label expression
   81 | &&L_OP_CONCAT,
      | ^~
./ljumptab.h:82:1: warning: c2rust: Cannot translate GNU address of label expression
   82 | &&L_OP_CLOSE,
      | ^~
./ljumptab.h:83:1: warning: c2rust: Cannot translate GNU address of label expression
   83 | &&L_OP_TBC,
      | ^~
./ljumptab.h:84:1: warning: c2rust: Cannot translate GNU address of label expression
   84 | &&L_OP_JMP,
      | ^~
./ljumptab.h:85:1: warning: c2rust: Cannot translate GNU address of label expression
   85 | &&L_OP_EQ,
      | ^~
./ljumptab.h:86:1: warning: c2rust: Cannot translate GNU address of label expression
   86 | &&L_OP_LT,
      | ^~
./ljumptab.h:87:1: warning: c2rust: Cannot translate GNU address of label expression
   87 | &&L_OP_LE,
      | ^~
./ljumptab.h:88:1: warning: c2rust: Cannot translate GNU address of label expression
   88 | &&L_OP_EQK,
      | ^~
./ljumptab.h:89:1: warning: c2rust: Cannot translate GNU address of label expression
   89 | &&L_OP_EQI,
      | ^~
./ljumptab.h:90:1: warning: c2rust: Cannot translate GNU address of label expression
   90 | &&L_OP_LTI,
      | ^~
./ljumptab.h:91:1: warning: c2rust: Cannot translate GNU address of label expression
   91 | &&L_OP_LEI,
      | ^~
./ljumptab.h:92:1: warning: c2rust: Cannot translate GNU address of label expression
   92 | &&L_OP_GTI,
      | ^~
./ljumptab.h:93:1: warning: c2rust: Cannot translate GNU address of label expression
   93 | &&L_OP_GEI,
      | ^~
./ljumptab.h:94:1: warning: c2rust: Cannot translate GNU address of label expression
   94 | &&L_OP_TEST,
      | ^~
./ljumptab.h:95:1: warning: c2rust: Cannot translate GNU address of label expression
   95 | &&L_OP_TESTSET,
      | ^~
./ljumptab.h:96:1: warning: c2rust: Cannot translate GNU address of label expression
   96 | &&L_OP_CALL,
      | ^~
./ljumptab.h:97:1: warning: c2rust: Cannot translate GNU address of label expression
   97 | &&L_OP_TAILCALL,
      | ^~
./ljumptab.h:98:1: warning: c2rust: Cannot translate GNU address of label expression
   98 | &&L_OP_RETURN,
      | ^~
./ljumptab.h:99:1: warning: c2rust: Cannot translate GNU address of label expression
   99 | &&L_OP_RETURN0,
      | ^~
./ljumptab.h:100:1: warning: c2rust: Cannot translate GNU address of label expression
  100 | &&L_OP_RETURN1,
      | ^~
./ljumptab.h:101:1: warning: c2rust: Cannot translate GNU address of label expression
  101 | &&L_OP_FORLOOP,
      | ^~
./ljumptab.h:102:1: warning: c2rust: Cannot translate GNU address of label expression
  102 | &&L_OP_FORPREP,
      | ^~
./ljumptab.h:103:1: warning: c2rust: Cannot translate GNU address of label expression
  103 | &&L_OP_TFORPREP,
      | ^~
./ljumptab.h:104:1: warning: c2rust: Cannot translate GNU address of label expression
  104 | &&L_OP_TFORCALL,
      | ^~
./ljumptab.h:105:1: warning: c2rust: Cannot translate GNU address of label expression
  105 | &&L_OP_TFORLOOP,
      | ^~
./ljumptab.h:106:1: warning: c2rust: Cannot translate GNU address of label expression
  106 | &&L_OP_SETLIST,
      | ^~
./ljumptab.h:107:1: warning: c2rust: Cannot translate GNU address of label expression
  107 | &&L_OP_CLOSURE,
      | ^~
./ljumptab.h:108:1: warning: c2rust: Cannot translate GNU address of label expression
  108 | &&L_OP_VARARG,
      | ^~
./ljumptab.h:109:1: warning: c2rust: Cannot translate GNU address of label expression
  109 | &&L_OP_GETVARG,
      | ^~
./ljumptab.h:110:1: warning: c2rust: Cannot translate GNU address of label expression
  110 | &&L_OP_ERRNNIL,
      | ^~
./ljumptab.h:111:1: warning: c2rust: Cannot translate GNU address of label expression
  111 | &&L_OP_VARARGPREP,
      | ^~
./ljumptab.h:112:1: warning: c2rust: Cannot translate GNU address of label expression
  112 | &&L_OP_EXTRAARG
      | ^~
lvm.c:1232:5: error: c2rust: the GNU C labels-as-values extension is not supported. Aborting.
 1232 |     vmdispatch (GET_OPCODE(i)) {
      |     ^~~~~~~~~~~~~~~~~~~~~~~~~~
./ljumptab.h:12:27: note: expanded from macro 'vmdispatch'
   12 | #define vmdispatch(x)     goto *disptab[x];
      |                           ^~~~~~~~~~~~~~~
zsh: IOT instruction (core dumped)  c2rust-transpile --overwrite-existing   

but all the .c_decls.json files seem valid (according to the json_verify tool from yajl):

for i in *.c_decls.json; do echo -n "$i: "; json_verify < $i; done
lapi.c_decls.json: JSON is valid
lcode.c_decls.json: JSON is valid
lctype.c_decls.json: JSON is valid
ldebug.c_decls.json: JSON is valid
ldo.c_decls.json: JSON is valid
ldump.c_decls.json: JSON is valid
lfunc.c_decls.json: JSON is valid
lgc.c_decls.json: JSON is valid
llex.c_decls.json: JSON is valid
lmem.c_decls.json: JSON is valid
lobject.c_decls.json: JSON is valid
lopcodes.c_decls.json: JSON is valid
lparser.c_decls.json: JSON is valid
lstate.c_decls.json: JSON is valid
lstring.c_decls.json: JSON is valid
ltable.c_decls.json: JSON is valid
ltm.c_decls.json: JSON is valid

@fw-immunant fw-immunant force-pushed the fw/c2rust-postprocess-split-merge branch 2 times, most recently from 485b683 to 6f349c1 Compare December 6, 2025 15:47
@fw-immunant
Contributor Author

We now seem to extract correct source ranges for all the definitions in libxml2 and lua in the testsuite.

@fw-immunant fw-immunant force-pushed the fw/c2rust-postprocess-split-merge branch from c56e84d to 9d65be7 Compare December 7, 2025 21:04

## Building

These tools rely on rust-analyzer's libraries, so they need a relatively recent version of Rust. Run `cargo build --release` in their respective directories to build them.
Contributor

For me, that picks up the c2rust nightly toolchain and fails because it is too old. I had to run cargo +stable build --release.

Contributor Author

@fw-immunant fw-immunant Dec 8, 2025

Oh, that's because I was running actual cargo, not rustup aliased to cargo. In the latter case (which is probably more common) your command is the right one; I'll clarify. Or alternately, rustup might pick up a rust-toolchain.toml if we placed one somewhere.

Contributor

@kkysen kkysen left a comment

How do I test this? Also, do we need to merge the --emit-c-decl-map changes to c2rust-transpile/c2rust-ast-exporter at the same time as the new tools/? The former looks mostly good to me, but I haven't reviewed the latter. Also, is it worth it to review split_ffi_entry_points, as FWIU, it's already in crisp?

Comment on lines 313 to 317
decls_sorted.sort_by(|v1, v2| {
self.c_decls[v1]
.begin_loc()
.cmp(&self.c_decls[v2].begin_loc())
});
Contributor

Suggested change
- decls_sorted.sort_by(|v1, v2| {
-     self.c_decls[v1]
-         .begin_loc()
-         .cmp(&self.c_decls[v2].begin_loc())
- });
+ decls_sorted.sort_by_key(|v| self.c_decls[v].begin_loc());
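For readers comparing the two forms in this suggestion, here is a minimal sketch of why they are interchangeable. `SrcLoc`, `Decl`, and the `u32` ids are hypothetical stand-ins, not the transpiler's real types; the only assumption is that `begin_loc()` returns a cheap `Ord` value.

```rust
use std::collections::HashMap;

// Hypothetical stand-ins for the transpiler's location and declaration types.
#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord)]
struct SrcLoc {
    line: u32,
    column: u32,
}

struct Decl {
    begin: SrcLoc,
}

impl Decl {
    fn begin_loc(&self) -> SrcLoc {
        self.begin
    }
}

/// Original form: an explicit comparator that looks up both sides.
fn sorted_with_cmp(decls: &HashMap<u32, Decl>) -> Vec<u32> {
    let mut ids: Vec<u32> = decls.keys().copied().collect();
    ids.sort_by(|a, b| decls[a].begin_loc().cmp(&decls[b].begin_loc()));
    ids
}

/// Suggested form: extract the key once per comparison side.
fn sorted_with_key(decls: &HashMap<u32, Decl>) -> Vec<u32> {
    let mut ids: Vec<u32> = decls.keys().copied().collect();
    ids.sort_by_key(|id| decls[id].begin_loc());
    ids
}
```

Both sorts are stable, so they only disagree if two declarations share a begin location and the input order differs.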

Contributor

Could you also move the sorting to before name_loc_map and prev_src_loc, which are only used in the loop?

),
};

match serde_json::ser::to_writer(file, &decl_map) {
Contributor

It's probably best to just write to a Vec<u8>/String and then write the file all at once. You could also write to a buffered writer, but that's easy to forget, like here.
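A rough sketch of the two options mentioned here, using only the standard library. The `serialize` closure is a hypothetical stand-in for the real serializer call, and the function names are made up for illustration. With serde_json, `serde_json::to_vec` followed by `fs::write` would be the shorter route to the first option.

```rust
use std::fs::{self, File};
use std::io::{self, BufWriter, Write};
use std::path::Path;

/// Option 1: serialize into memory, then write the file in one call.
/// `serialize` is a hypothetical stand-in for the serializer.
fn write_all_at_once<F>(path: &Path, serialize: F) -> io::Result<()>
where
    F: FnOnce(&mut Vec<u8>) -> io::Result<()>,
{
    let mut buf = Vec::new();
    serialize(&mut buf)?;
    fs::write(path, &buf)
}

/// Option 2: stream through a `BufWriter`. The explicit `flush` surfaces
/// write errors that would otherwise be swallowed when the writer is dropped.
fn write_buffered<F>(path: &Path, serialize: F) -> io::Result<()>
where
    F: FnOnce(&mut BufWriter<File>) -> io::Result<()>,
{
    let mut out = BufWriter::new(File::create(path)?);
    serialize(&mut out)?;
    out.flush()
}
```

The first form keeps the whole output in memory, which should be fine for a per-file declaration map; the second avoids that at the cost of remembering the wrapper and the flush.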

Comment on lines +620 to +622
"Unable to write C declaration map to file {}: {}",
output_path.display(),
e
Contributor

Suggested change
- "Unable to write C declaration map to file {}: {}",
- output_path.display(),
- e
+ "Unable to write C declaration map to file {}: {e}",
+ output_path.display(),
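As a small illustration of this suggestion: inline format arguments can capture a plain identifier such as `e`, but not an expression such as `output_path.display()`, which is why only the second placeholder changes. The helper name below is hypothetical.

```rust
use std::fmt::Display;
use std::path::Path;

/// Hypothetical helper that builds the message from the diff.
/// `{e}` captures the local variable `e`; the method call still needs
/// a positional `{}` because only identifiers can be captured inline.
fn write_error_message(output_path: &Path, e: &dyn Display) -> String {
    format!(
        "Unable to write C declaration map to file {}: {e}",
        output_path.display(),
    )
}
```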

}

let file_content =
std::fs::read(&t.ast_context.get_file_path(t.main_file).unwrap()).unwrap();
Contributor

@kkysen kkysen Dec 8, 2025

Could we use a .expect() here or fs_err? A file missing is a fairly common error, so better context is important.

Contributor Author

Probably a good idea, will do.
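One possible shape for the extra context, sketched with the standard library only. The function name is hypothetical; the fs_err crate mentioned above attaches the path in a similar spirit without the manual wrapping.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Read a file, attaching the path to any I/O error so that a missing
/// input is easy to diagnose. Hypothetical helper, not the PR's code.
fn read_with_context(path: &Path) -> io::Result<Vec<u8>> {
    fs::read(path).map_err(|e| {
        io::Error::new(
            e.kind(),
            format!("failed to read C source file `{}`: {e}", path.display()),
        )
    })
}
```

If aborting is acceptable at that call site, an `.expect("...")` with the path in the message gives the same information with less code.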

let mut begin_offset = src_loc_to_byte_offset(&line_end_offsets, begin);
let mut end_offset = src_loc_to_byte_offset(&line_end_offsets, end);
assert!(begin_offset <= end_offset);
const VT: u8 = 11;
Contributor

What's VT?

Contributor

Vertical tab?

Contributor Author

Yep, ASCII VT, which has gone out of fashion so Rust lacks an escape for it.

Contributor

Suggested change
- const VT: u8 = 11;
+ const VT: u8 = 11; // Vertical Tab
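For context on the offsets in this hunk, here is a guess at what a line/column to byte-offset conversion could look like. The signatures are assumptions: the diff passes a location value rather than separate line and column arguments, and `compute_line_end_offsets` is a made-up name. The sketch also assumes 1-based lines and columns, with columns counted in bytes.

```rust
/// For each line, the byte offset just past its terminating `\n`
/// (or the end of the buffer for an unterminated final line).
fn compute_line_end_offsets(src: &[u8]) -> Vec<usize> {
    let mut ends = Vec::new();
    for (i, &b) in src.iter().enumerate() {
        if b == b'\n' {
            ends.push(i + 1);
        }
    }
    if ends.last() != Some(&src.len()) {
        ends.push(src.len());
    }
    ends
}

/// Convert a 1-based (line, column) pair into a 0-based byte offset.
/// Panics if `line` is past the end of the table; a real implementation
/// would probably want to report that instead.
fn src_loc_to_byte_offset(line_end_offsets: &[usize], line: usize, column: usize) -> usize {
    let line_start = if line <= 1 { 0 } else { line_end_offsets[line - 2] };
    line_start + column.saturating_sub(1)
}
```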

})
.collect::<HashMap<_, _>>();

// Generate a map from Rust items to the source code of their C declarations.
Contributor

Could we put this in its own function? I think it'd be easier to follow that way.

Contributor Author

Yeah, an emit_c_decl_map function would probably be good to avoid bloating fn translate.

.collect::<HashMap<_, _>>();

// Generate a map from Rust items to the source code of their C declarations.
let decl_map = if let Some(decl_source_ranges) = decl_source_ranges {
Contributor

decl_source_ranges.map(|decl_sources_ranges| {})?
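A tiny sketch of the rewrite being suggested, under the assumption (not visible in this hunk) that the `else` branch simply yields `None`. `SourceRanges` and both function names are hypothetical.

```rust
/// Hypothetical stand-in for the real source-range collection.
type SourceRanges = Vec<(usize, usize)>;

/// Shape used in the diff: explicit `if let` with a `None` fallback.
fn count_with_if_let(ranges: Option<SourceRanges>) -> Option<usize> {
    if let Some(ranges) = ranges {
        Some(ranges.len())
    } else {
        None
    }
}

/// Suggested shape: `Option::map` expresses the same thing directly.
fn count_with_map(ranges: Option<SourceRanges>) -> Option<usize> {
    ranges.map(|ranges| ranges.len())
}
```

If the `else` branch does anything other than produce `None`, `map` is not a drop-in replacement.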

assert!(begin_offset <= end_offset);
const VT: u8 = 11;
/* Skip whitespace and any trailing semicolons after the previous decl. */
while let Some(b'\t' | b'\n' | &VT | b'\r' | b' ' | b';') =
Contributor

Could you use u8::is_ascii_whitespace + ;?

Contributor Author

It doesn't include \v, but this could be expressed using that function if we wanted. I don't know if it's clearer, given the specific set of characters we care about.

Contributor

Hmm. is_ascii_whitespace also handles FF but not VT. Is VT worth handling at all? I guess it doesn't matter that much. I don't imagine anything will actually produce VTs or FFs.
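To make the difference between the two character sets in this thread concrete, here is a short sketch. The function names are hypothetical, and the forward scan is a simplification of whatever the real loop does.

```rust
const VT: u8 = 0x0B; // ASCII vertical tab; Rust has no `\v` escape

/// The explicit set from the diff: tab, LF, VT, CR, space, and `;`.
fn is_skippable_explicit(b: u8) -> bool {
    matches!(b, b'\t' | b'\n' | VT | b'\r' | b' ' | b';')
}

/// Variant built on the standard predicate. `is_ascii_whitespace` covers
/// space, tab, LF, FF, and CR, so VT and `;` are added by hand, and form
/// feed becomes skippable as a side effect.
fn is_skippable_std(b: u8) -> bool {
    b.is_ascii_whitespace() || b == VT || b == b';'
}

/// Advance `offset` past any skippable bytes.
fn skip_leading(bytes: &[u8], mut offset: usize) -> usize {
    while bytes.get(offset).copied().map_or(false, is_skippable_std) {
        offset += 1;
    }
    offset
}
```

The two predicates differ only on form feed (`0x0C`), so the choice is mostly about which reads more clearly.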

@fw-immunant
Contributor Author

How do I test this? Also, do we need to merge the --emit-c-decl-map changes to c2rust-transpile/c2rust-ast-exporter at the same time as the new tools/? The former looks mostly good to me, but I haven't reviewed the latter. Also, is it worth it to review split_ffi_entry_points, as FWIU, it's already in crisp?

The flag is independent of the tools but intended to be used alongside them in the CRISP loop. This PR doesn't include any automated testing yet, but I've been testing manually by invoking c2rust-transpile with the flag and examining the produced .c_decls.json files.

In addition to the testsuite, I've been using the following test case which I'll probably add as a snapshot test:

#include <stddef.h>
#include <stdint.h>
int a;int bb;/*comment for cc*/int ccc;
/*
123456789012345678901234567890
*/
typedef uintptr_t ngx_uint_t;
void f(void){}void g(void){}/*comment for h*/void h(void){}int d;int e;;;;;;;;int another;
//comment for ngx_hash_init
void
ngx_hash_init(void)
{
    size_t len;
    ngx_uint_t align = len & ~sizeof(void *);
}
int
anotherfunc(void)
{
    int var = 0;
    return 4 + var;
}
int
funcholdingdefine(void)
{
#define FOO 4000+50
    return FOO;
}
void forwarddeclbeforeotherdef(int x);
void funcafterdecl(int n) {}
/*real defn*/
void forwarddeclbeforeotherdef(int x) { }

@fw-immunant fw-immunant mentioned this pull request Dec 9, 2025
@fw-immunant fw-immunant force-pushed the fw/c2rust-postprocess-split-merge branch from 34658b1 to d08939a Compare December 12, 2025 17:27
this is being used for local definitions instead of the `static` keyword, so it should encompass the newer helpers we've defined recently
Clang's SourceRange ends at the beginning of the final token, but this is not useful if we want to look at every character of a declaration, as we may when considering the spelling of the corresponding C definition
otherwise, we treat them as one character, which means the next definition's source range will expand to include them as if they were leading comments for it
this can occur if a macro is defined inside a top-level function
otherwise, inner decls like local variables confuse our tracking of definition bounds
@fw-immunant fw-immunant force-pushed the fw/c2rust-postprocess-split-merge branch from d08939a to 018a315 Compare December 12, 2025 17:28
@fw-immunant
Contributor Author

fw-immunant commented Dec 15, 2025

Should be good to go as long as CI passes. Still needs automated test coverage but that can be a follow-up.

this does seem to be benign in practice, so for now only emit it at debug logging level