66 changes: 42 additions & 24 deletions Cargo.lock

7 changes: 6 additions & 1 deletion architecture/sandbox.md
@@ -14,7 +14,12 @@ Each sandbox workload has two trust levels:
| Agent child | Runs as an unprivileged user with filesystem, process, and network restrictions applied. |

The supervisor keeps enough privilege to manage the sandbox, but the agent child
loses that privilege before user code runs.
loses that privilege before user code runs. On Linux, child setup clears the
capability bounding set during privilege drop so later execs cannot regain
container-granted capabilities. This is fail-closed: the supervisor retains
`CAP_SETPCAP` solely to perform the clear, and spawning the workload or SSH shell
aborts unless the bounding set ends up empty. A `setpcap` `EPERM` is tolerated
only when the set is already empty; any other outcome fails the spawn.
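
The fail-closed rule described above can be sketched as a small decision function. This is a minimal illustration, not the supervisor's actual code: `verify_bounding_set_clear`, `SpawnError`, and the hard-coded `EPERM` constant are hypothetical names introduced here. Treating a non-`EPERM` error as fatal even when the set is already empty is an assumption read from "any other outcome fails the spawn".

```rust
use std::io;

/// Linux errno value for EPERM, hard-coded so the sketch needs no libc crate.
const EPERM: i32 = 1;

/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq)]
enum SpawnError {
    /// The bounding set still holds capabilities after the clear attempt.
    BoundingSetNotEmpty,
    /// The clear attempt failed with something other than EPERM.
    ClearFailed(io::ErrorKind),
}

/// Decide whether a spawn may proceed, given the result of the clear
/// attempt and whether the bounding set is empty afterwards.
fn verify_bounding_set_clear(
    clear_result: io::Result<()>,
    set_is_empty: bool,
) -> Result<(), SpawnError> {
    match clear_result {
        // Clear succeeded: still require the set to be empty.
        Ok(()) if set_is_empty => Ok(()),
        Ok(()) => Err(SpawnError::BoundingSetNotEmpty),
        // EPERM is tolerated only when there was nothing left to clear.
        Err(e) if e.raw_os_error() == Some(EPERM) => {
            if set_is_empty {
                Ok(())
            } else {
                Err(SpawnError::BoundingSetNotEmpty)
            }
        }
        // Assumption: any other error aborts the spawn unconditionally.
        Err(e) => Err(SpawnError::ClearFailed(e.kind())),
    }
}
```

A caller would run this check after attempting the clear and before exec, and abort the workload or SSH shell spawn on any `Err`.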

## Startup Flow

23 changes: 16 additions & 7 deletions crates/openshell-driver-podman/src/container.rs
@@ -877,8 +877,6 @@ pub fn build_container_spec_with_token_and_gpu_devices(
"NET_RAW".into(),
// Not needed: the supervisor does not manipulate file capabilities.
"SETFCAP".into(),
// Not needed: the supervisor does not manage its own capability bounding set.
"SETPCAP".into(),
// Not needed: the supervisor does not call chroot().
"SYS_CHROOT".into(),
],
@@ -899,13 +897,18 @@ pub fn build_container_spec_with_token_and_gpu_devices(
// Without it the proxy cannot determine which binary made each outbound
// connection and all traffic is denied.
"DAC_READ_SEARCH".into(),
// Child setup clears the capability bounding set before exec, which
// requires CAP_SETPCAP in the supervisor until drop_privileges().
"SETPCAP".into(),
],
// SETUID, SETGID, CHOWN, and FOWNER are intentionally kept from Podman's
// default set and not dropped:
// SETUID, SETGID, SETPCAP, CHOWN, and FOWNER are intentionally kept from
// Podman's default set and not dropped:
// SETUID/SETGID – drop_privileges(): setuid()/setgid()/initgroups() to the
// sandbox user. In rootless Podman cap_drop:ALL removes them
// from the bounding set even though uid=0 owns the user
// namespace — so we keep them by not dropping them explicitly.
// SETPCAP – drop_privileges(): clears the child capability
// bounding set before the sandbox user execs.
// CHOWN – prepare_filesystem(): chown(path, uid, gid) on newly
// created read_write directories so the sandbox user can
// write to them.
@@ -1451,12 +1454,14 @@ mod tests {
added.contains(&"DAC_READ_SEARCH"),
"missing DAC_READ_SEARCH"
);
assert!(added.contains(&"SETPCAP"), "missing SETPCAP");

// SETUID and SETGID are NOT in cap_add — they remain available from the
// default bounding set because we no longer use cap_drop:ALL. Verify they
// are also not explicitly dropped. Similarly CHOWN and FOWNER must not be
// dropped because prepare_filesystem() calls chown() on newly created
// read_write directories before the supervisor drops privileges.
// are also not explicitly dropped. Similarly SETPCAP, CHOWN and FOWNER
// must not be dropped because child setup clears the bounding set and
// prepare_filesystem() calls chown() on newly created read_write
// directories before the supervisor drops privileges.
let dropped: Vec<&str> = spec["cap_drop"]
.as_array()
.expect("cap_drop should be an array")
@@ -1473,6 +1478,10 @@ mod tests {
!dropped.contains(&"FOWNER"),
"FOWNER must not be dropped (needed for chown on non-owned files)"
);
assert!(
!dropped.contains(&"SETPCAP"),
"SETPCAP must not be dropped (needed for child bounding-set clear)"
);
assert!(
!dropped.contains(&"ALL"),
"must not use cap_drop:ALL in rootless Podman"
1 change: 1 addition & 0 deletions crates/openshell-supervisor-process/Cargo.toml
@@ -35,6 +35,7 @@ libc = "0.2"
rustix = { workspace = true }

[target.'cfg(target_os = "linux")'.dependencies]
capctl = "0.2.4"
landlock = "0.4"
seccompiler = "0.5"
tempfile = "3"