mirror of
https://chromium.googlesource.com/crosvm/crosvm
synced 2025-02-11 12:35:26 +00:00
main: remove EPOLLHUP epoll item from host kernel synchronously
control_sockets.swap_remove() could cause host kernel to invoke ep_remove() to remove the epoll item. But it's called from the task work, and it could be deferred after next poll_ctx.wait() which could unexpectedly pick up epoll events from the already closed fd. BUG=chromium:1019986 TEST=launch Crosvm guest from heavy loaded Linux host Change-Id: I474a7a47a484e3acfae4383d61601e1553bd674f Signed-off-by: Zide Chen <zide.chen@intel.com> Reviewed-on: https://chromium-review.googlesource.com/c/chromiumos/platform/crosvm/+/1917495 Reviewed-by: Zach Reizner <zachr@chromium.org> Reviewed-by: Daniel Verkamp <dverkamp@chromium.org> Tested-by: kokoro <noreply+kokoro@google.com>
This commit is contained in:
parent
f35d8904b8
commit
8958407dcb
1 changed files with 17 additions and 1 deletions
18
src/linux.rs
18
src/linux.rs
|
@ -1845,10 +1845,26 @@ fn run_control(
|
||||||
}
|
}
|
||||||
|
|
||||||
// Sort in reverse so the highest indexes are removed first. This removal algorithm
|
// Sort in reverse so the highest indexes are removed first. This removal algorithm
|
||||||
// preserved correct indexes as each element is removed.
|
// preserves correct indexes as each element is removed.
|
||||||
vm_control_indices_to_remove.sort_unstable_by(|a, b| b.cmp(a));
|
vm_control_indices_to_remove.sort_unstable_by(|a, b| b.cmp(a));
|
||||||
vm_control_indices_to_remove.dedup();
|
vm_control_indices_to_remove.dedup();
|
||||||
for index in vm_control_indices_to_remove {
|
for index in vm_control_indices_to_remove {
|
||||||
|
// Delete the socket from the `poll_ctx` synchronously. Otherwise, the kernel will do
|
||||||
|
// this automatically when the FD inserted into the `poll_ctx` is closed after this
|
||||||
|
// if-block, but this removal can be deferred unpredictably. In some instances where the
|
||||||
|
// system is under heavy load, we can even get events returned by `poll_ctx` for an FD
|
||||||
|
// that has already been closed. Because the token associated with that spurious event
|
||||||
|
// now belongs to a different socket, the control loop will start to interact with
|
||||||
|
// sockets that might not be ready to use. This can cause incorrect hangup detection or
|
||||||
|
// blocking on a socket that will never be ready. See also: crbug.com/1019986
|
||||||
|
if let Some(socket) = control_sockets.get(index) {
|
||||||
|
poll_ctx.delete(socket).map_err(Error::PollContextDelete)?;
|
||||||
|
}
|
||||||
|
|
||||||
|
// This line implicitly drops the socket at `index` when it gets returned by
|
||||||
|
// `swap_remove`. After this line, the socket at `index` is not the one from
|
||||||
|
// `vm_control_indices_to_remove`. Because of this socket's change in index, we need to
|
||||||
|
// use `poll_ctx.modify` to change the associated index in its `Token::VmControl`.
|
||||||
control_sockets.swap_remove(index);
|
control_sockets.swap_remove(index);
|
||||||
if let Some(socket) = control_sockets.get(index) {
|
if let Some(socket) = control_sockets.get(index) {
|
||||||
poll_ctx
|
poll_ctx
|
||||||
|
|
Loading…
Reference in a new issue