MIR: Fixing Allocation Disjointness For Soundness

Alex Johnson

Introduction

Hey guys! Today, we're diving into a soundness issue in the MIR (Mid-Level Intermediate Representation) backend of SAW (Software Analysis Workbench): the check for allocation disjointness has a loophole that can lead to unsound verifications. Let's break down how an insufficient disjointness check causes the problem, and how switching to mirRef_overlapsIO fixes it.

The Problem: Insufficient Disjointness Check

The main issue lies within the enforceDisjointness function. Currently, this function only verifies that references aren't equal. This is a problem because two references can be unequal and still point to the same memory allocation, meaning they aren't truly disjoint. When this happens, it can introduce unsoundness into our verification process.

Consider this Rust code:

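// Returns true when y points exactly one u32 past x.
pub fn f(x: *const u32, y: *const u32) -> bool {
    unsafe { x.add(1) == y }
}

// Passes two pointers into the same array, so f returns true here.
pub fn g() -> bool {
    let a = [1, 2];
    let p = a.as_ptr();
    unsafe { f(p, p.add(1)) }
}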

In this scenario, x and y in function f point to different elements of the same array a. They are unequal, so the current enforceDisjointness lets them through, yet they share an allocation and so are not disjoint.

Here's the corresponding SAW script that demonstrates the issue. The spec for f allocates x and y separately, and under that assumption x.add(1) == y can never hold, so SAW proves that f returns False. With that proof used as an override, SAW then verifies that g returns False; without the override, it verifies that g returns True. Both results for g can't be right:

enable_experimental;

m <- mir_load_module "test.linked-mir.json";

let f_spec = do {
 x <- mir_alloc_raw_ptr_const_multi 2 mir_u32;
 y <- mir_alloc_raw_ptr_const mir_u32;
 mir_execute_func [x, y];
 mir_return (mir_term {{ False }});
};

f_ov <- mir_verify m "test::f" [] false f_spec z3;

let g_spec0 = do {
 mir_execute_func [];
 mir_return (mir_term {{ False }});
};

mir_verify m "test::g" [f_ov] false g_spec0 z3;

let g_spec1 = do {
 mir_execute_func [];
 mir_return (mir_term {{ True }});
};

mir_verify m "test::g" [] false g_spec1 z3;
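
To make the gap concrete, here is a minimal Rust sketch of the two checks side by side. Everything in it is hypothetical: ToyRef, passes_inequality_check, and passes_allocation_check are illustration-only names, and this is a toy model rather than how SAW represents references internally.

```rust
/// Toy stand-in for a reference: the allocation it targets plus an element
/// offset inside that allocation. Hypothetical; not SAW's representation.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct ToyRef {
    alloc: usize,
    offset: usize,
}

/// Models the weak check: two references pass as long as they are not equal.
fn passes_inequality_check(a: ToyRef, b: ToyRef) -> bool {
    a != b
}

/// Models a stronger check: two references pass only when they come from
/// different allocations.
fn passes_allocation_check(a: ToyRef, b: ToyRef) -> bool {
    a.alloc != b.alloc
}
```

With x at offset 0 and y at offset 1 of the same allocation, as in g above, the inequality check passes while the allocation check rejects the pair.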

The equalRefsPred Optimization: Another Pitfall

To add to the complexity, the equalRefsPred function has an optimization that can also lead to unsoundness. If the references being compared have different types, the function simply returns false. This is problematic because pointer types can be cast, meaning two pointers of different types could point to the same memory location. Consider the following Rust code:

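// Compares the two pointers by address after casting y to *const u32.
pub fn f(x: *const u32, y: *const u8) -> bool {
    x == y as *const u32
}

// Passes the address of a under two different pointer types, so f returns true here.
pub fn g() -> bool {
    let a: u32 = 1;
    f(&raw const a, &raw const a as *const u8)
}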

Here, x is a *const u32 and y is a *const u8, but they both point to a. Because their types differ, the equalRefsPred optimization would incorrectly return false, so the two pointers pass the disjointness check even though they alias.

Here's the corresponding SAW script:

enable_experimental;

m <- mir_load_module "test.linked-mir.json";

let f_spec = do {
 x <- mir_alloc_raw_ptr_const mir_u32;
 y <- mir_alloc_raw_ptr_const mir_u8;
 mir_execute_func [x, y];
 mir_return (mir_term {{ False }});
};

f_ov <- mir_verify m "test::f" [] false f_spec z3;

let g_spec0 = do {
 mir_execute_func [];
 mir_return (mir_term {{ False }});
};

mir_verify m "test::g" [f_ov] false g_spec0 z3;

let g_spec1 = do {
 mir_execute_func [];
 mir_return (mir_term {{ True }});
};

mir_verify m "test::g" [] false g_spec1 z3;
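
The effect of the type shortcut can be sketched with another small Rust model. The names ToyTy, TypedRef, equal_with_type_shortcut, and equal_by_address are all hypothetical and exist only for this illustration; the real equalRefsPred works on symbolic references, not plain addresses.

```rust
/// Hypothetical pointee types for the toy model.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum ToyTy {
    U32,
    U8,
}

/// Toy reference: a static pointee type plus the address it points to.
#[derive(Clone, Copy, Debug)]
struct TypedRef {
    ty: ToyTy,
    addr: usize,
}

/// Models the shortcut: references of different types are never equal.
fn equal_with_type_shortcut(a: TypedRef, b: TypedRef) -> bool {
    if a.ty != b.ty {
        return false;
    }
    a.addr == b.addr
}

/// Models a comparison that ignores the static type and looks only at
/// the address.
fn equal_by_address(a: TypedRef, b: TypedRef) -> bool {
    a.addr == b.addr
}
```

For a u32 reference and a u8 reference with the same address, the shortcut version reports "not equal" while the address-only version reports "equal", which mirrors the cast in g above.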

The Solution: Using mirRef_overlapsIO

To address these issues, we need to modify enforceDisjointness to use mirRef_overlapsIO. This function is already used by crucible-mir-comp in its checkDisjoint function, and it checks whether two references actually overlap in memory rather than whether they are merely equal. That catches references into the same allocation even when they are not equal.

Moreover, we need to ensure that mirRef_overlapsIO is called regardless of the types of the references being compared. This will prevent the unsoundness caused by the equalRefsPred optimization.

Why mirRef_overlapsIO?

mirRef_overlapsIO is designed to determine whether two memory references have any overlapping regions, which is exactly the question a disjointness check needs to answer. As long as it is called regardless of the reference types, the check also stays robust to pointer casts.

Implementation Details

Here’s what the implementation would involve:

  1. Modify enforceDisjointness: Replace the current equality check with a call to mirRef_overlapsIO.
  2. Remove Type Check in equalRefsPred: Ensure that mirRef_overlapsIO is always called, even if the reference types differ, instead of short-circuiting to false.
  3. Testing: Thoroughly test the changes to ensure that the unsoundness issues are resolved and no new issues are introduced. Include tests that specifically check for overlap within the same allocation and across different pointer types.
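
The steps above can be sketched as a pairwise check driven by an overlap predicate. This is a rough Rust model under stated assumptions: Region, regions_overlap, and overlapping_pairs are hypothetical names, and modelling each reference as a byte range is an assumption made for illustration, since this post does not spell out how mirRef_overlapsIO decides overlap.

```rust
/// Hypothetical byte range covered by one reference: [start, start + len).
#[derive(Clone, Copy, Debug)]
struct Region {
    start: usize,
    len: usize,
}

/// Two non-empty half-open ranges overlap when each one starts before the
/// other one ends. Note that no type information is consulted.
fn regions_overlap(a: Region, b: Region) -> bool {
    a.len > 0 && b.len > 0 && a.start < b.start + b.len && b.start < a.start + a.len
}

/// Checks every unordered pair and returns the index pairs that overlap.
/// An empty result means all regions are pairwise disjoint.
fn overlapping_pairs(regions: &[Region]) -> Vec<(usize, usize)> {
    let mut bad = Vec::new();
    for i in 0..regions.len() {
        for j in (i + 1)..regions.len() {
            if regions_overlap(regions[i], regions[j]) {
                bad.push((i, j));
            }
        }
    }
    bad
}
```

Returning the offending pairs rather than a single boolean makes it easier to write the tests from step 3, since each test can name exactly which references it expects to clash.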

Benefits of the Fix

By implementing these changes, we can achieve:

  • Improved Soundness: Verifications will be more reliable, as we'll correctly identify when memory accesses are not disjoint.
  • Enhanced Memory Safety: By accurately detecting memory overlaps, we can prevent potential vulnerabilities and unexpected behavior.
  • Consistency: Aligning the MIR backend's disjointness checks with those used in crucible-mir-comp ensures a more consistent verification process.

Conclusion

In summary, enforceDisjointness in the MIR backend currently accepts references that are unequal but still share an allocation, and the type shortcut in equalRefsPred lets cast pointers slip through as well. Both gaps can lead to unsound verifications. Switching to mirRef_overlapsIO, and making sure it is always called, closes them and makes SAW's memory access analysis more accurate and reliable.

By making these changes, we're not just patching a bug; we're reinforcing the foundation of our verification process. Keep pushing the boundaries, and always strive for correctness in every detail!

For more in-depth information on memory safety and verification, check out the Formal Methods at CMU website.
