It looks like free() will sometimes try to open
/proc/sys/vm/overcommit_memory in order to decide whether to return
freed heap memory to the kernel; change the seccomp filter to fail the
open syscalls with an error code (ENOENT) rather than killing the
process.
Also allow madvise to free memory for the same free() codepath.
BUG=chromium:888212
TEST=Run fio loop test on kevin
Change-Id: I1c27b265b822771f76b7d9572d9759476770000e
Signed-off-by: Daniel Verkamp <dverkamp@chromium.org>
Reviewed-on: https://chromium-review.googlesource.com/1305756
Commit-Ready: ChromeOS CL Exonerator Bot <chromiumos-cl-exonerator@appspot.gserviceaccount.com>
Reviewed-by: Dylan Reid <dgreid@chromium.org>
This is needed to make sure seccomp work with glibc 2.27
BUG=chromium:897477
TEST=None
Change-Id: I101aa07bffd8db2b449be1a697dafcd7d6f1cb58
Reviewed-on: https://chromium-review.googlesource.com/1294729
Commit-Ready: Yunlian Jiang <yunlian@chromium.org>
Tested-by: Yunlian Jiang <yunlian@chromium.org>
Reviewed-by: Mike Frysinger <vapier@chromium.org>
This adds openat to a seccomp policy file if open is already there.
We need this because glibc 2.25 changed it system call for open().
BUG=chromium:894614
TEST=None
Change-Id: Ie5b45d858e8d9ea081fd7bfda81709bda048d965
Reviewed-on: https://chromium-review.googlesource.com/1292129
Commit-Ready: Yunlian Jiang <yunlian@chromium.org>
Tested-by: Yunlian Jiang <yunlian@chromium.org>
Reviewed-by: Manoj Gupta <manojgupta@chromium.org>
In deallocate_cluster(), we call set_cluster_refcount() to unref the
cluster that is being deallocated, but we never actually added the
deallocated cluster to the unref_clusters list. Add clusters whose
refcounts reach 0 to the unref_clusters list as well.
Also add mremap() to the seccomp whitelist for the block device, since
this is being triggered by libc realloc() and other devices already
include it in the whitelist.
BUG=chromium:850998
TEST=cargo test -p qcow; test crosvm on nami and verify that qcow file
size stays bounded when creating a 1 GB file and deleting it
repeatedly
Change-Id: I1bdd96b2176dc13069417e0ac77f0768f9f26012
Signed-off-by: Daniel Verkamp <dverkamp@chromium.org>
Reviewed-on: https://chromium-review.googlesource.com/1259404
Reviewed-by: Dylan Reid <dgreid@chromium.org>
Add newfstatat for x86 and fstatat64 for arm to the seccomp policy file
for the 9p device and server program.
BUG=chromium:886535
TEST=`vmc share termina foo` and then `ls /mnt/shared` inside the VM
works
Change-Id: I6871f54ae885e080dca0ea5751987d59c55a59d6
Signed-off-by: Chirantan Ekbote <chirantan@chromium.org>
Reviewed-on: https://chromium-review.googlesource.com/1232556
Reviewed-by: Stephen Barber <smbarber@chromium.org>
The path to the wayland socket changed, so the previous whitelist based
on the connect() arg2 sockaddr_un size now fails.
BUG=None
TEST=Verify that release build of crosvm starts again on chromebook
Change-Id: I3c30977e7c1487b937d69e1dbce4b7fd87136978
Signed-off-by: Daniel Verkamp <dverkamp@chromium.org>
Reviewed-on: https://chromium-review.googlesource.com/1234827
Reviewed-by: David Riley <davidriley@chromium.org>
Reviewed-by: Stephen Barber <smbarber@chromium.org>
Reviewed-by: Zach Reizner <zachr@chromium.org>
"devices: block: Flush a minute after a write" introduced new timerfd_
syscalls into the block device but did not add them to the seccomp
whitelist.
BUG=chromium:885238
TEST=Run crosvm in multiprocess mode and verify that it boots
Change-Id: I1568946c64d86ab7dba535a430a8cbe235f64454
Signed-off-by: Daniel Verkamp <dverkamp@chromium.org>
Reviewed-on: https://chromium-review.googlesource.com/1231513
Commit-Ready: Dylan Reid <dgreid@chromium.org>
Tested-by: Dylan Reid <dgreid@chromium.org>
Reviewed-by: Dylan Reid <dgreid@chromium.org>
Discard and Write Zeroes commands have been added to the virtio block
specification:
88c8553838
Implement both commands using the WriteZeroes trait.
BUG=chromium:850998
TEST=fstrim within termina on a writable qcow image
Change-Id: I33e54e303202328c10f7f2d6e69ab19f419f3998
Signed-off-by: Daniel Verkamp <dverkamp@chromium.org>
Reviewed-on: https://chromium-review.googlesource.com/1188680
Reviewed-by: Stephen Barber <smbarber@chromium.org>
Reviewed-by: Dylan Reid <dgreid@chromium.org>
Seccomp policy for ARM hosts was recently moved from aarch64 to arm to
accurately match the ABI used on the host. Move 9s policy to match this.
BUG=none
TEST=vm.Webserver on kevin succeeds
Change-Id: I97daa524edcd411618561ce07525738bc65457cb
Reviewed-on: https://chromium-review.googlesource.com/1180470
Commit-Ready: Stephen Barber <smbarber@chromium.org>
Tested-by: Stephen Barber <smbarber@chromium.org>
Reviewed-by: Sonny Rao <sonnyrao@chromium.org>
Implement a policy for the balloon device so that it starts taking
memory away from the VM when the system is under low memory conditions.
There are a few pieces here:
* Change the madvise call in MemoryMapping::dont_need_range to use
MADV_REMOVE instead of MADV_DONTNEED. The latter does nothing when
the memory mapping is shared across multiple processes while the
former immediately gives the pages in the specified range back to the
kernel. Subsequent accesses to memory in that range returns zero
pages.
* Change the protocol between the balloon device process and the main
crosvm process. Previously, the device process expected the main
process to send it increments in the amount of memory consumed by the
balloon device. Now, it instead just expects the absolute value of
the memory that should be consumed. To properly implement the policy
the main process needs to keep track of the total memory consumed by
the balloon device so this makes it easier to handle all the policy in
one place.
* Add a policy for dealing with low memory situations. When the VM
starts up, we determine the maximum amount of memory that the balloon
device should consume:
* If the VM has more than 1.5GB of memory, the balloon device max is
the size of the VM memory minus 1GB.
* Otherwise, if the VM has at least 500MB, the balloon device max is
50% of the size of the VM memory.
* Otherwise, the max is 0.
The increment used to change the size of the balloon is defined as
1/16 of the max memory that the balloon device will consume. When the
crosvm main process detects that the system is low on memory, it
immediately increases the balloon size by the increment (unless it has
already reached the max). It then starts 2 timers: one to check for
low memory conditions again in 1 seconds (+ jitter) and another to
check if the system is no longer low on memory in 1 minute (+ jitter)
with a subsequent interval of 30 seconds (+ jitter).
Under persistent low memory conditions the balloon device will consume
the maximum memory after 16 seconds. Once there is enough available
memory the balloon size will shrink back down to 0 after at most 9
minutes.
BUG=chromium:866193
TEST=manual
Start 2 VMs and write out a large file (size > system RAM) in each.
Observe /sys/kernel/mm/chromeos-low_mem/available and see that the
available memory steadily decreases until it goes under the low memory
margin at which point the available memory bounces back up as crosvm
frees up pages.
CQ-DEPEND=CL:1152214
Change-Id: I2046729683aa081c9d7ed039d902ad11737c1d52
Signed-off-by: Chirantan Ekbote <chirantan@chromium.org>
Reviewed-on: https://chromium-review.googlesource.com/1149155
Reviewed-by: Sonny Rao <sonnyrao@chromium.org>
These policies are not for aarch64 but use the 32-bit system calls.
We call it aarch64 support because that's what we're targetting for
the guest kernel, but it doesn't really make any sense to call the
seccomp policies aarch64 when we're building a 32-bit binary.
We can add real aarch64 seccomp policies when we start building a
aarch64 crosvm binary.
BUG=chromium:866197
TEST=emerge-kevin crosvm, run vm_CrosVmStart
CQ-DEPEND=CL:1145903
Change-Id: I7c5e70fbc127e4209ed392cfcf10ea36a6dd4b2c
Reviewed-on: https://chromium-review.googlesource.com/1145909
Commit-Ready: Sonny Rao <sonnyrao@chromium.org>
Tested-by: Sonny Rao <sonnyrao@chromium.org>
Reviewed-by: Stephen Barber <smbarber@chromium.org>
Reviewed-by: Zach Reizner <zachr@chromium.org>