Skip to content

Conversation

MathiasVP
Copy link
Contributor

This PR switches C/C++ away from our own guards library and over to the shared guards library which was introduced in #19573.

The main motivation for this is two-fold: It reduces the amount of C/C++ specific QL code that needs to be maintained 1. It also enables C/C++ to instantiate the newly shared "nullness" library which was introduced in #20367. This should hopefully open the doors to better null dereference / use-after-free / double free queries.

On top of that, the improvements to the internal guards logic also means much better query results on existing queries. In particular, the cpp/missing-check-scanf query has lost a tons of FPs 🎉.

Commit-by-commit review extremely recommended. It's a large PR, but I actually think I managed to make it fairly reviewable 🤞.

This PR does bring a syntactic breaking changes which was just decided a couple of weeks ago would be okay for CodeQL going forward. The breaking change is that the AST wrapper around the IR-based guards library now extends Element instead of Expr. This is because the shared library also infers that Parameters (which are not Exprs in the AST) can be guards if the parameter determines a condition.

In fact, whereas the old library only made the condition (and certain subexpressions inside the condition) were Guards, the new library makes every expression a Guard, but only some of those Guards actually controls conditions. This is a fundamental difference that's always been between the C/C++ guards library and the Java/C# guards library, and we're finally getting rid of that difference which brings some more language consistency 🎉.

Footnotes

  1. Although, there is still lots of maintenance required because C/C++ still relies on our own implication of ensuresEq and ensuresLt. Getting rid of those requires switching C/C++ fully over to the shared range analysis library.

@MathiasVP MathiasVP requested a review from a team as a code owner September 18, 2025 09:23
@MathiasVP MathiasVP requested review from Copilot and removed request for a team September 18, 2025 09:23
@github-actions github-actions bot added the C++ label Sep 18, 2025
Copy link
Contributor

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR switches the C++ CodeQL library from its own guards library to the shared guards library that was introduced previously. The main purpose is to reduce C++-specific QL code maintenance and enable C++ to use the shared "nullness" library. This change also improves query results on existing queries, particularly reducing false positives in the cpp/missing-check-scanf query.

Key changes include:

  • Replacing the old C++-specific guards implementation with the shared guards library
  • Updating guard value types from AbstractValue to GuardValue
  • Making guards extend Element instead of Expr to support parameters as guards
  • Improving the guards logic to provide better query results

Reviewed Changes

Copilot reviewed 18 out of 18 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
GuardsEnsure.expected Updated test expectations showing improved guard inference with more comprehensive results
GuardsControl.ql Changed parameter type from AbstractValue to GuardValue
GuardsControl.expected Expanded test results with additional guard control relationships
GuardsCompare.ql Updated parameter type from AbstractValue to GuardValue
GuardsCompare.expected Extended comparison results with more comprehensive guard comparisons
Guards.ql Removed simple guard selection query
Guards.expected Removed corresponding test expectations
tests.ql Updated parameter types to use GuardValue
TOCTOUFilesystemRace.ql Fixed guard child access by casting to Expr
SSLResultConflation.ql Fixed guard child access by casting to Expr
SemanticExprSpecific.qll Updated method call from controlsEdge to controlsBranchEdge
Instruction.qll Added getAnInput() method to BinaryInstruction
EdgeKind.qll Enhanced switch edge handling with better value range support
SsaImpl.qll Major updates to SSA implementation with improved guard integration
IRGuards.qll Complete rewrite using shared guards library with expanded functionality
RangeAnalysis.qll Updated method call to use controlsBranchEdge

}

private import semmle.code.cpp.dataflow.new.DataFlow::DataFlow as DataFlow
private import semmle.code.cpp.ir.dataflow.internal.DataFlowPrivate as Private

Check warning

Code scanning / CodeQL

Names only differing by case Warning

Private is only different by casing from private that is used elsewhere for modules.
@MathiasVP MathiasVP marked this pull request as draft September 18, 2025 09:52
@MathiasVP MathiasVP force-pushed the use-shared-guards-library branch from c99ab52 to 6fe3e83 Compare September 18, 2025 10:21
@aschackmull
Copy link
Contributor

This is because the shared library also infers that Parameters (which are not Exprs in the AST) can be guards if the parameter determines a condition

No. Parameters can't be Guards. But case statements can - and that's what brings Guards beyond mere Exprs.

@MathiasVP
Copy link
Contributor Author

No. Parameters can't be Guards. But case statements can - and that's what brings Guards beyond mere Exprs.

Oh. I guess there's a subtle thing for C/C++ here then: In the IR world a parameter is an Instruction (specifically, it's an InitializeParameter instruction). And since Instructions are Exprs in the new guards library this means the new guards library can infer that InitializeParameter can be a guard ... and the InitializeParameter maps back to the AST-world as a Parameter.

Is that ... bad?

@MathiasVP
Copy link
Contributor Author

Phew! Thanks for sanity checking that! I'll pull this PR out of draft in a second once CI has finished the last check. I chatted with @jketema earlier and he's hoping that you can chime in on the reviewing on this PR. I guess primarily 840097f. Would you mind doing that at some point? 🙏

@aschackmull
Copy link
Contributor

Would you mind doing that at some point? 🙏

Sure, I'll take a look (not today, though).

@MathiasVP MathiasVP marked this pull request as ready for review September 18, 2025 11:59

private module GuardsImpl = SharedGuards::Make<Cpp::Location, IRCfg, GuardsInput>;

private module LogicInput_v1 implements GuardsImpl::LogicInputSig {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we only have _v1 then the suffix seems somewhat superfluous, but I assume you perhaps intend to add a _v2, which adds the output from Range Analysis? I.e. let's keep the _v1 suffix.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, eventually I will add a _2 😄 However, I wanted to keep things as semantically identical as I could for this first PR.

@aschackmull
Copy link
Contributor

Could you perhaps add a new test with a bunch of complex guard implications to showcase that these are indeed recognized? E.g. something like this and a bunch of other cases:

x = notNull(y) ? 42 : 0;
if (x != 0) {
  // guarded by `y != null`
}

And a test case for data flow that shows that barrier guard wrappers are now supported.

@MathiasVP MathiasVP force-pushed the use-shared-guards-library branch from dad3413 to b2269fb Compare September 24, 2025 18:06
@MathiasVP MathiasVP force-pushed the use-shared-guards-library branch from 74799e5 to a07d03f Compare September 24, 2025 19:10
@MathiasVP
Copy link
Contributor Author

Could you perhaps add a new test with a bunch of complex guard implications to showcase that these are indeed recognized? E.g. something like this and a bunch of other cases:

x = notNull(y) ? 42 : 0;
if (x != 0) {
  // guarded by `y != null`
}

And a test case for data flow that shows that barrier guard wrappers are now supported.

Thanks a lot for reminding me to add those tests! I've added a bunch in b2269fb. That commit also highlights some missing inferences, and 26a8a4b also demonstrated that something was totally broken with the guard wrapper implementation.

Luckily, the fix was easy: a07d03f. With that change the missing inferences are now fixed, and the barrier guard wrappers also work 🎉.

I'll start another DCA to see if this has any effects on real-world projects.

void test_ternary(void* y) {
int x = testNotNull1(y) ? 42 : 0;
if (x != 0) {
chk(); // $ guarded='... ? ... : ...:not 0' guarded='42:not 0' guarded='call to testNotNull1:true' guarded='x != 0:true' guarded='x:not 0' guarded='y:not null'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, we shouldn't see guarded='42:not 0' as that's a trivial tautology. But I think perhaps the fix for that is tucked away in one of my local branches...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I was wondering whether there was any filtering on such trivial implications as well, but I couldn't find anything in the current library.

Comment on lines 365 to 383
// We use the `Store` instruction that writes the return value instead of the
// `ReturnValue` instruction since the `ReturnValue` instruction is not always
// dominated by certain guards. For example:
// ```
// if(b) {
// return true;
// } else {
// return false;
// }
// ```
// this will be translated into IR like:
// ```
// if(b) {
// x = true;
// } else {
// x = false;
// }
// return x;
// ```
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm somewhat confused by this. Both code snippets should be equally well supported as wrapper guards, so if the latter doesn't work then something else is missing. Make sure you have test cases for both kinds of wrapper guards.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, wait. I think I know what was wrong with the old approach. Nothing to see here 🙈! I've deleted the incorrect comment and added a bunch of examples in c1c1f60. They all seem to work perfectly fine 🎉

@MathiasVP
Copy link
Contributor Author

MathiasVP commented Sep 29, 2025

@aschackmull thanks for all the review comments so far! I think I've fixed all of them, and DCA still looks fine. Do you have any other comments before I hand it over to the C team?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants