fix: synchronize with udev via BSD flock after losetup to avoid race condition

Metadata

Project: imagecraft
Number: #330
Type: pull request
State: open
Author: Copilot
Labels:
Created: 2026-04-21 22:07:34+00:00
Updated: 2026-05-03 00:47:33+00:00
Closed:

Current evaluation

Resolves a udev inotify race condition causing loop partition mount failures by implementing a brief BSD flock synchronization barrier, with implementation refined per reviewer feedback and integration tests added.

Suggested action: needs review

Reason: The PR has been updated in response to maintainer feedback and includes new integration tests, but a maintainer expressed doubt about the fix's effectiveness 29 days ago. It requires further technical review and validation from the maintainer before it can be merged or closed.

Staleness: 30 Complexity: 45

Issue body

After `losetup --find --show --partscan`, udev holds an exclusive BSD flock on the whole-disk device while processing it. Mounting immediately after attach can fail with `special device /dev/loopNpM does not exist` because udev's inotify watch hasn't finished. The correct fix per [systemd BLOCK_DEVICE_LOCKING](https://systemd.io/BLOCK_DEVICE_LOCKING/) is to briefly acquire a shared flock on the loop device as a synchronization barrier (blocking until udev releases its exclusive lock), then release it immediately so udev remains free to process further events.

## Changes

- **`imagecraft/pack/image.py`**
  - `Image.attach_loopdev()` briefly acquires `fcntl.LOCK_SH` on the loop device after attaching (inside the fresh-attach branch). The lock is acquired and released before `yield`, ensuring udev has finished processing without holding the lock for the duration of the context manager.
- **`imagecraft/services/image.py`**
  - `ImageService.attach_images()` acquires and immediately releases `LOCK_SH` on each loop device (fresh and reused) using a `with open(dev, "rb") as loop_fd: flock(...)` block. No persistent file handles are stored; no flock cleanup is needed in `detach_images()`.
- **Tests**
  - Removed polling-based tests; tests now mock `fcntl.flock` and `builtins.open`
  - Added `test_attach_images_flock_sync_and_release` (unit) to verify the shared lock is acquired and immediately released (not held persistently)
  - Added `test_attach_images_partition_nodes_exist` (integration, requires root) to verify that every partition device node returned by `get_loop_paths()` actually exists on the filesystem as a block device immediately after `attach_images()` returns, directly demonstrating that the flock synchronization prevents the udev race condition

```python
# After losetup --find --show --partscan returns /dev/loop7:
with open("/dev/loop7", "rb") as loop_fd:
    fcntl.flock(loop_fd, fcntl.LOCK_SH)  # blocks until udev is done, then releases
# lock released: partition nodes are guaranteed to exist, udev free to continue
yield "/dev/loop7"
```
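The acquire-and-release barrier and the block-device check described in this issue body can be sketched as two standalone helpers. This is a minimal illustration, not the imagecraft implementation: `flock_barrier` and `missing_block_devices` are hypothetical names. The lock mode is left as a parameter that defaults to the `LOCK_SH` the PR describes. Whether the call actually waits depends on udev holding a conflicting lock at that moment, which is the PR's premise and is not verified here.

```python
import fcntl
import os
import stat


def flock_barrier(device_path: str, operation: int = fcntl.LOCK_SH) -> None:
    """Take a BSD flock on device_path, then drop it straight away.

    Hypothetical helper. The call blocks only while another open file
    description holds a conflicting lock on the same file.
    """
    with open(device_path, "rb") as dev_fd:
        fcntl.flock(dev_fd, operation)      # waits for any conflicting lock
        fcntl.flock(dev_fd, fcntl.LOCK_UN)  # explicit release; close() would also drop it


def missing_block_devices(paths):
    """Return the paths that are absent or are not block device nodes.

    Hypothetical helper mirroring what the integration test is described
    as checking after attach_images() returns.
    """
    missing = []
    for path in paths:
        try:
            mode = os.stat(path).st_mode
        except FileNotFoundError:
            missing.append(path)
            continue
        if not stat.S_ISBLK(mode):
            missing.append(path)
    return missing
```

In a root-only integration test, one could call `flock_barrier("/dev/loop7")` after attaching and then assert that `missing_block_devices(...)` returns an empty list for the expected partition paths. The device path here is the example value from the PR text.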

Evaluation history

| Date | Model | Scores | Action | Summary |
| --- | --- | --- | --- | --- |
| 2026-06-01 11:15:36.051727+00:00 | qwen3.6-35b-moe-q4 | Staleness: 30, Complexity: 45 | needs review | Resolves a udev inotify race condition causing loop partition mount failures by implementing a brief BSD flock synchronization barrier, with implementation refined per reviewer feedback and integration tests added. |
| 2026-06-01 11:11:49.551693+00:00 | pending | | | |