-
Notifications
You must be signed in to change notification settings - Fork 121
RFC: Partitioned proofs #4228
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
RFC: Partitioned proofs #4228
Changes from 3 commits
22dd697
369aad3
d6df026
ed3faf1
ed8ae0e
3d0ac20
412f080
b9243e4
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,138 @@ | ||
- **Feature Name:** Harness Partition (`harness-partition`) | ||
- **Feature Request Issue:** [#3006](https://github.com/model-checking/kani/issues/3006) | ||
- **RFC PR:** https://github.com/model-checking/kani/pull/4228 | ||
- **Status:** Under Review | ||
- **Version:** 0 | ||
- **Proof-of-concept:** [prototype on local branch](https://github.com/model-checking/kani/compare/main...AlexanderPortland:kani:harness-partitioning) | ||
|
||
------------------- | ||
|
||
## Summary | ||
|
||
It can often be useful to subdivide an expensive proof harness so that different parts of the input space are verified separately. Adding the built-in ability to partition a proof harness into different pieces (that each make differing assumptions about their inputs) could reduce the cost of expensive proofs, while allowing us to automatically check that the partitions cover the entire input space and, thus, will not affect soundness. | ||
|
||
## User Impact | ||
|
||
Imagine that you have a function to verify like the following (based on the example from [#3006](https://github.com/model-checking/kani/issues/3006)). Since there are two tricky to analyze function calls, but only one will ever be called on a given input, you might want to verify all inputs that will take the first branch separately from those that will take the second. This way, each solve would be smaller in isolation, and you'd be able to take advantage of CBMC's internal parallelism by running both proofs at once. | ||
AlexanderPortland marked this conversation as resolved.
Show resolved
Hide resolved
AlexanderPortland marked this conversation as resolved.
Show resolved
Hide resolved
|
||
|
||
```rust | ||
pub fn target_fn(input: i32) -> isize { | ||
if input > 0 { | ||
hard_to_analyze_fn_1(input) | ||
} else { | ||
hard_to_analyze_fn_2(input) | ||
} | ||
} | ||
AlexanderPortland marked this conversation as resolved.
Show resolved
Hide resolved
|
||
|
||
#[kani::proof] | ||
pub fn proof_harness() { | ||
let input = kani::any(); | ||
assert!(target_fn(input) > 0) | ||
} | ||
``` | ||
|
||
The best way to currently do this would be to manually partition out these paths into two proof harnesses. | ||
|
||
```rust | ||
#[kani::proof] | ||
pub fn first_branch_harness() { | ||
let input = kani::any_where(|i: &i32| *i > 0i32); | ||
assert!(target_fn(input) > 0) | ||
} | ||
|
||
#[kani::proof] | ||
pub fn second_branch_harness() { | ||
let input = kani::any_where(|i: &i32| *i < 0i32); // This should've been i <= 0 | ||
assert!(target_fn(input) > 0) | ||
} | ||
``` | ||
AlexanderPortland marked this conversation as resolved.
Show resolved
Hide resolved
|
||
|
||
However, this can affect soundess, as there's no guarantee that your partitions will fully span the space of possible inputs. The only way to determine that a set of proofs like the one above are incorrect (as it forgets to account for when `i == 0`) is by manual inspection, which gets infeasible for proofs with complex types or partition rules. | ||
AlexanderPortland marked this conversation as resolved.
Show resolved
Hide resolved
AlexanderPortland marked this conversation as resolved.
Show resolved
Hide resolved
|
||
|
||
AlexanderPortland marked this conversation as resolved.
Show resolved
Hide resolved
|
||
It would be helpful if Kani provided this as a built-in feature that could reason about given partition conditions to provide soundess guarantees. | ||
AlexanderPortland marked this conversation as resolved.
Show resolved
Hide resolved
|
||
|
||
## User Experience | ||
AlexanderPortland marked this conversation as resolved.
Show resolved
Hide resolved
|
||
|
||
The current thought is for users to interact with this using a new `#[kani::partitioned_proof()]` attribute where they provide the conditions by which to partition the input space of their proof. Each condition must be of type `fn(&T) -> bool` (where `T` is the input type to the proof body that implements `kani::Arbitrary`) and is used to filter which values are part of that partition. | ||
AlexanderPortland marked this conversation as resolved.
Show resolved
Hide resolved
|
||
|
||
For example, the above would become | ||
|
||
```rust | ||
#[kani::partitioned_proof(|a: &i32| { *a > 0 }, |a: &i32| { *a < 0 })] | ||
pub fn partitioned_harness(input: i32) { | ||
assert!(target_fn(input) > 0) | ||
} | ||
AlexanderPortland marked this conversation as resolved.
Show resolved
Hide resolved
|
||
``` | ||
|
||
And Kani would automatically handle checking the partition conditions for soundess and generating the partitioned proofs. Overlaps would be allowed as they don't affect soundess and may be useful in certain cases (see [below](#2-checking-for-overlapping-partitions) for more discussion). | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Just curious: how do you actually accomplish the soundness check? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The thought (and implementation in the previous prototype) is that you would generate a new proof harness that declares |
||
|
||
Generally, use of this feature would look like the following (where `T` is an arbitrary input type). | ||
|
||
```rust | ||
#[kani::partitioned_proof(|a: &T| { #condition_1# }, |a: &T| { #condition_2# }, ..., |a: &T| { #condition_n# })] | ||
pub fn harness(input: T) { | ||
/* your interesting assertions here */ | ||
} | ||
``` | ||
|
||
## Software Design | ||
|
||
A prototype of this design has been implemented locally [here](https://github.com/AlexanderPortland/kani/tree/harness-partitioning). | ||
|
||
It introduces the `#[kani::partitioned_proof()]` attribute as variant of `#[kani::proof]` that takes in a set of closures representing the partition bounds. With those bounds, it keeps the main harness function the same, but does not annotate it with the `#kani_attributes` that would typically mark it as an entrypoint for Kani's verification. Instead, for each partition, it will generate a new function definition for that portion of the proof. These functions are just wrappers that constrain an arbitrary value to be within the partition and then call the real harness that will contain the meat of the proof. Their names are derived from a hash of their condition closure for uniqueness (but see [below](#open-questions) for discussion on how to improve this). So, for the running example above, the call to `kani::partitioned_proof` generates: | ||
|
||
```rust | ||
#[kani::proof] | ||
// (hash of |a| *a > 0) | ||
AlexanderPortland marked this conversation as resolved.
Show resolved
Hide resolved
|
||
pub fn partitioned_harness_7199877664941740246() { | ||
let t = kani::any_where(|a: &i32| { *a > 0 }); | ||
partitioned_harness(t) | ||
AlexanderPortland marked this conversation as resolved.
Show resolved
Hide resolved
|
||
} | ||
|
||
#[kani::proof] | ||
// (hash of |a| *a < 0) | ||
AlexanderPortland marked this conversation as resolved.
Show resolved
Hide resolved
|
||
pub fn partitioned_harness_2423413845937420568() { | ||
let t = kani::any_where(|a: &i32| { *a < 0 }); | ||
partitioned_harness(t) | ||
} | ||
``` | ||
|
||
It will then also inject a new proof that has an assert to ensure that all possible values fall into one of the given partitions. In this case, it will fail to be verified (as `t` can be 0), telling the user that their partitions are not complete and could affect proof soundness. | ||
AlexanderPortland marked this conversation as resolved.
Show resolved
Hide resolved
|
||
|
||
|
||
```rust | ||
#[kani::proof] | ||
pub fn partitioned_harness_missing_full_coverage() { | ||
let t: i32 = kani::any(); | ||
let partitions: [fn(&i32) -> bool; 2] = [|a: &i32| { *a > 0 }, |a: &i32| { *a < 0 }]; | ||
let partitions_have_full_coverage: bool = partitions.into_iter().any(|condition| condition(&t)); | ||
assert!(partitions_have_full_coverage) | ||
} | ||
``` | ||
|
||
### Corner cases | ||
1. This current implementation could run into issues if the main proof harness & partition condition closures both don't specify concrete types (e.g. by using generics), as the compiler may not be able to determine what `T` to use `kani::any()` on. This should be fixed by explicitly enforcing the current implicit assumption that the input type is a concrete type. | ||
2. This implementation doesn't play nicely if your proof is intentionally only covers a subspace of inputs (e.g. if you're trying to partition a proof that initially used `kani::any_where(...)`). (see the open questions [below](#open-questions) for thoughts on how to fix this) | ||
AlexanderPortland marked this conversation as resolved.
Show resolved
Hide resolved
AlexanderPortland marked this conversation as resolved.
Show resolved
Hide resolved
|
||
|
||
## Rationale and alternatives | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I would expand this section a bit with some rationale for your design, namely:
You've done a lot of thinking to settle on this design, so it'd be good to reflect that in the writing by (briefly) mentioning other points in the design space and why you decided they weren't as good. |
||
|
||
### 1. Alternative APIs | ||
The `kani::partition([i > 0, i < 0], || target_fn(input))` syntax suggested ([here](https://github.com/model-checking/kani/issues/3006#issue-2123964835)) in the initial issue is clear, but runs into some implementation issues irregardless of whether it's a function call or macro. The goal of this feature is to generate multiple proof harnesses from the single call to `partition`, and this is very difficult to do that from a macro that's already inside a function (as your codegen is limited to within that function's scope). | ||
AlexanderPortland marked this conversation as resolved.
Show resolved
Hide resolved
|
||
|
||
The design described above is simpler as it is an attribute macro and, thus, has access to the program's global scope, allowing it to keep the proof body the same and simply generate additional function definition wrappers for each partitioned proof. | ||
|
||
This was also initially implemented as an addition to the existing `#[kani::proof]` attribute, where users would specify partition conditions with `kani::proof(partitions = [|a| a % 2 == 9, ...])`. However, since these kinds of proofs require that the annotated function take an input, while `#[kani::proof]` functions typically don't, it seemed more consistent to introduce a separate attribute macro. | ||
AlexanderPortland marked this conversation as resolved.
Show resolved
Hide resolved
|
||
|
||
### 2. Checking for overlapping partitions | ||
As a corellary to checking that partitions span the space of all possible inputs, I had initially considered giving a warning if partitions are overlapping as overlap could indicate ill-defined bounds. However, I decided against this, as overlapping partitions may be useful in certain cases where the user wishes to purposefully oversimplify their partition conditions in a way that will cause overlap but make them easier to express. | ||
AlexanderPortland marked this conversation as resolved.
Show resolved
Hide resolved
|
||
|
||
## Open questions | ||
AlexanderPortland marked this conversation as resolved.
Show resolved
Hide resolved
|
||
|
||
1. What's the best way to ensure that type errors in the `kani::partitioned_proof` macro are clear, understandable and actionable for users? | ||
2. Can function names be generated more prettily while still remainging unique? (the current hashes are often opaque for users) | ||
AlexanderPortland marked this conversation as resolved.
Show resolved
Hide resolved
|
||
3. Are there any additional correctness issues introduced by this approach? | ||
4. Is there a way to simplify the generated `..._full_coverage()` proof harness? The iteration seems to be instrumented by our code which adds additional checks. | ||
5. Is there a way to promote the fact that verification failed on the `..._full_coverage()` proof harness as an error that could affect soundess? If Kani's users tend to look into any failed checks, this may not be needed, but it may be possible for a single failure to get lost in big projects. | ||
6. Would it be desirable to add an optional argument that specifies which subset of inputs you're trying to verify with the proof as a whole (if not provided just the whole input space)? | ||
AlexanderPortland marked this conversation as resolved.
Show resolved
Hide resolved
|
||
|
||
[^unstable_feature]: This unique ident should be used to enable features proposed in the RFC using `-Z <ident>` until the feature has been stabilized. |
Uh oh!
There was an error while loading. Please reload this page.