Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

std: introduce pointer stability locks to hash maps #17719

Merged
merged 4 commits into from
Mar 16, 2024
Merged

Conversation

andrewrk
Copy link
Member

@andrewrk andrewrk commented Oct 26, 2023

This adds std.debug.SafetyLock and uses it in standard library hash maps by adding lockPointers() and unlockPointers().

This provides a way to detect when an illegal modification has happened and panic rather than invoke undefined behavior.

Example

const std = @import("std");

pub fn main() !void {
    const gpa = std.heap.page_allocator;
    var map: std.AutoHashMapUnmanaged(i32, i32) = .{};

    const gop = try map.getOrPut(gpa, 1234);
    map.lockPointers();
    defer map.unlockPointers();

    gop.value_ptr.* = try calculate(gpa, &map);
}

fn calculate(gpa: std.mem.Allocator, m: anytype) !i32 {
    try m.put(gpa, 42, 420);
    return 999;
}
$ zig run test3.zig 
thread 96878 panic: reached unreachable code
/home/andy/Downloads/zig/lib/std/debug.zig:342:14: 0x220772 in assert (test3)
    if (!ok) unreachable; // assertion failure
             ^
/home/andy/Downloads/zig/lib/std/debug.zig:2641:15: 0x22125b in lock (test3)
        assert(l.state == .unlocked);
              ^
/home/andy/Downloads/zig/lib/std/hash_map.zig:1309:40: 0x251a23 in getOrPutContextAdapted__anon_7181 (test3)
                self.pointer_stability.lock();
                                       ^
/home/andy/Downloads/zig/lib/std/hash_map.zig:1296:56: 0x221165 in getOrPutContext (test3)
            const gop = try self.getOrPutContextAdapted(allocator, key, ctx, ctx);
                                                       ^
/home/andy/Downloads/zig/lib/std/hash_map.zig:1222:52: 0x2212ba in putContext (test3)
            const result = try self.getOrPutContext(allocator, key, ctx);
                                                   ^
/home/andy/Downloads/zig/lib/std/hash_map.zig:1219:35: 0x21e353 in put (test3)
            return self.putContext(allocator, key, value, undefined);
                                  ^
/home/andy/Downloads/zig/build-release/test3.zig:15:14: 0x21e2e6 in calculate__anon_3623 (test3)
    try m.put(gpa, 42, 420);
             ^
/home/andy/Downloads/zig/build-release/test3.zig:11:36: 0x21e51b in main (test3)
    gop.value_ptr.* = try calculate(gpa, &map);
                                   ^
/home/andy/Downloads/zig/lib/std/start.zig:583:37: 0x21e1f5 in posixCallMainAndExit (test3)
            const result = root.main() catch |err| {
                                    ^
/home/andy/Downloads/zig/lib/std/start.zig:251:5: 0x21dce1 in _start (test3)
    asm volatile (switch (native_arch) {
    ^
???:?:?: 0x0 in ??? (???)
Aborted (core dumped)

Something nice about this is that if you use the "assume capacity" variants then such mutations are actually well-defined, and don't trigger the problem:

const std = @import("std");

pub fn main() !void {
    const gpa = std.heap.page_allocator;
    var map: std.AutoHashMapUnmanaged(i32, i32) = .{};

    try map.ensureUnusedCapacity(gpa, 2);
    const gop = map.getOrPutAssumeCapacity(1234);
    map.lockPointers();
    defer map.unlockPointers();

    gop.value_ptr.* = calculate(&map);
}

fn calculate(m: anytype) i32 {
    m.putAssumeCapacity(42, 420);
    return 999;
}
$ stage3/bin/zig run test3.zig 

Merge Checklist

  • add to managed variants
  • add to array hash maps

Follow-Up Issues

andrewrk added a commit that referenced this pull request Oct 26, 2023
The main problem being fixed here is there was a getOrPut() that held on
to a reference to the value pointer too long, and meanwhile the call to
`lowerConst` ended up being recursive and mutating the hash map,
invoking undefined behavior.

caught via #17719
pub const SafetyLock = struct {
state: State = .unlocked,

pub const State = if (runtime_safety) enum { unlocked, locked } else enum { unlocked };
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not else void to not rely on the optimizer? Would this break @setRuntimeSafety?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

an enum with only 1 tag is a 0-bit type just like void

Copy link
Sponsor Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Today I learned. 😍

@GethDW
Copy link
Contributor

GethDW commented Oct 30, 2023

I think it would be good to add a ConfigurableTrace and panic with it to quickly identify where it was locked. I'm not too familiar with StackTrace but I played around with it and got:

pub const SafetyLock = struct {
    state: State = .unlocked,
    trace: std.debug.ConfigurableTrace(2, 1, runtime_safety) = .{},

    pub const State = if (runtime_safety) enum { unlocked, locked } else enum { unlocked };

    pub fn lock(l: *SafetyLock) void {
        if (!runtime_safety) return;
        l.trace.addAddr(@returnAddress(), "locked");
        if (l.state == .locked) l.panic("locked while already locked");
        l.state = .locked;
    }

    pub fn unlock(l: *SafetyLock) void {
        if (!runtime_safety) return;
        l.trace.addAddr(@returnAddress(), "unlock");
        if (l.state == .unlocked) l.panic("unlocked while already unlocked");
        l.trace = .{};
        l.state = .unlocked;
    }

    pub fn assertUnlocked(l: *SafetyLock, comptime msg: []const u8, ret_addr: usize) void {
        if (!runtime_safety) return;
        l.trace.addAddr(ret_addr, msg);
        if (l.state != .unlocked) l.panic(msg);
    }

    fn panic(l: SafetyLock, comptime msg: []const u8) noreturn {
        var insts = [2]usize{ l.trace.addrs[0][0], l.trace.addrs[1][0] };
        const stack_trace = std.builtin.StackTrace{
            .index = l.trace.index,
            .instruction_addresses = &insts,
        };
        std.debug.panicImpl(&stack_trace, null, msg);
    }
};

The trace can then look like this:

thread 343239 panic: modified while locked
/home/gethin/random/test/src/main.zig:53:14: 0x21d788 in main (test)
    lock.lock(); // locked
             ^
/home/gethin/random/test/src/main.zig:54:11: 0x21d791 in main (test)
    modify(&lock); // modified
...

This adds std.debug.SafetyLock and uses it in std.HashMapUnmanaged by
adding lockMutation() and unlockMutation().

This provides a way to detect when an illegal mutation has happened and
panic rather than invoke undefined behavior.
@andrewrk
Copy link
Member Author

andrewrk commented Mar 16, 2024

@GethDW that would be handy. I think I could see that being a global std lib option perhaps, to enable sometimes. I would imagine the stack trace collection might be a bit punishing in terms of performance. That said, we could try it and find out for sure.

Let's explore that in a follow-up proposal.

@andrewrk andrewrk added release notes This PR should be mentioned in the release notes. enhancement Solving this issue will likely involve adding new logic or components to the codebase. standard library This issue involves writing Zig code for the standard library. labels Mar 16, 2024
/// assertion.
///
/// `unlockMutation` returns the hash map to the previous state.
pub fn lockMutation(self: *Self) void {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would lockRealloc() be a better name for this? Some mutation is allowed, for example the fooAssumeCapacity() function variants as you mentioned.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a nice name. I'll make that change.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On second thought, "realloc" is not the right concept either since it also gets triggered for clearing and shrinking (with retained capacity).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed! I think what you've landed on with lockPointers() is great :)

@andrewrk andrewrk changed the title std: introduce mutation locks to hash maps std: introduce pointer stability locks to hash maps Mar 16, 2024
@andrewrk andrewrk merged commit 242ab81 into master Mar 16, 2024
10 checks passed
@andrewrk andrewrk deleted the std-mutation-lock branch March 16, 2024 22:45
@andrewrk
Copy link
Member Author

FYI @mlugg I imagine this tool will be quite handy when working on incremental compilation stuff.

RossComputerGuy pushed a commit to ExpidusOS/zig that referenced this pull request Mar 20, 2024
This adds std.debug.SafetyLock and uses it in std.HashMapUnmanaged by
adding lockPointers() and unlockPointers().

This provides a way to detect when an illegal modification has happened and
panic rather than invoke undefined behavior.
@andrewrk andrewrk added this to the 0.12.0 milestone Apr 18, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement Solving this issue will likely involve adding new logic or components to the codebase. release notes This PR should be mentioned in the release notes. standard library This issue involves writing Zig code for the standard library.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants