Zide Chen 4d7fd7f93d devices: virtio: set zero length mmio eventfd for VQ notification address
By registering the notify address with Datamatch::AnyLength, KVM is able
to take advantage of KVM_FAST_MMIO_BUS to accelerate eventfd handling.

This does not appear to violate the virtio spec, because "writing the
16-bit virtqueue index" refers to the front-end driver implementation,
for example Linux's vp_notify() function. How the VMM handles the VQ
index write is not covered by the virtio spec. Here crosvm ensures that
every VQ has a dedicated notify address and KVM implements the
notification by eventfd, which should be fine with the spec.

On eve Pixelbook (R77 host kernel 4.4.185), in 5000 samples, on average
the MMIO write vmexit takes 0.96us with fast_mmio enabled, while it takes
3.36us without fast_mmio.

without fast_mmio:
    232812.822491: kvm_exit: reason EPT_MISCONFIG rip 0xffffffff986f18ef info 0 0
    232812.822492: kvm_emulate_insn: 0:ffffffff986f18ef:66 89 3e (prot64)
    232812.822493: vcpu_match_mmio: gva 0xffffb0f4803a1004 gpa 0xe000f004 Write GPA
    232812.822493: kvm_mmio: mmio write len 2 gpa 0xe000f004 val 0x1
    232812.822495: kvm_entry: vcpu 1

with fast_mmio:
    230585.034396: kvm_exit: reason EPT_MISCONFIG rip 0xffffffff9a6f18ef info 0 0
    230585.034397: kvm_fast_mmio: fast mmio at gpa 0xe000f004
    230585.034397: kvm_entry: vcpu 1

BUG=chromium:993488
TEST=Boot Crostini on eve and run iperf benchmark
TEST=Analyze kernel trace for vmexit handling time

Change-Id: Id1dac22b37490f7026b6c119c85ca9d104a8a3f4
Signed-off-by: Zide Chen <zide.chen@intel.corp-partner.google.com>
Signed-off-by: Sainath Grandhi <sainath.grandhi@intel.corp-partner.google.com>
Reviewed-on: https://chromium-review.googlesource.com/c/chromiumos/platform/crosvm/+/1762282
Reviewed-by: Daniel Verkamp <dverkamp@chromium.org>
Tested-by: kokoro <noreply+kokoro@google.com>
Commit-Queue: Daniel Verkamp <dverkamp@chromium.org>
2019-08-23 19:49:54 +00:00
aarch64 crosvm: virtio-pmem device 2019-06-05 07:28:54 +00:00
arch Allow to connect standard input to a serial port other than the guest console 2019-08-03 20:03:23 +00:00
assertions edition: Remove extern crate lines 2019-04-15 02:06:08 -07:00
bin crosvm: add license blurb to bin/clippy and bin/fmt files 2019-06-08 04:27:35 +00:00
bit_field crosvm: update xhci abi to use new bit_field features 2019-05-25 02:31:16 -07:00
crosvm_plugin crosvm: use pipe instead of socket for vcpu communication 2019-04-24 15:51:11 -07:00
data_model data_model: allow reading structs from io::Read 2019-08-14 03:44:28 +00:00
devices devices: virtio: set zero length mmio eventfd for VQ notification address 2019-08-23 19:49:54 +00:00
docker Dockerfile: stop tracking virglrenderer master 2019-08-20 17:46:23 +00:00
enumn enumn: fix duplicate fn in doc tests 2019-05-30 01:11:16 +00:00
fuzz fuzz: zimage: use a fixed guest memory size 2019-06-25 17:12:05 +00:00
gpu_buffer gpu_display: add X11 backend 2019-07-25 22:15:48 +00:00
gpu_display gpu_display: fix clippy warnings 2019-08-20 21:21:26 +00:00
gpu_renderer devices: gpu: remove BackedBuffer/GpuRendererResource distinction 2019-08-20 17:46:26 +00:00
io_jail edition: Remove extern crate lines 2019-04-15 02:06:08 -07:00
kernel_cmdline tree-wide: update to new inclusive range syntax 2019-07-24 02:22:21 +00:00
kernel_loader kernel_loader: check phdr memory size addition 2019-06-25 17:12:06 +00:00
kokoro add docker supported builds and tests 2019-05-15 13:36:19 -07:00
kvm edition: Eliminate blocks superseded by NLL 2019-04-17 17:22:57 -07:00
kvm_sys crosvm: add license blurb to all files 2019-04-24 15:51:38 -07:00
msg_socket eliminate usage of uninitialized 2019-05-23 07:35:18 -07:00
net_sys net_sys: regenerate if.h bindings using Rust native union 2019-05-23 02:14:24 -07:00
net_util net_sys: regenerate if.h bindings using Rust native union 2019-05-23 02:14:24 -07:00
p9 edition: Eliminate ref keyword 2019-04-18 19:51:01 -07:00
protos crosvm: add license blurb to all files 2019-04-24 15:51:38 -07:00
qcow qcow: bounds check the refcount table offset and size 2019-07-31 09:37:34 +00:00
qcow_utils edition: Remove extern crate lines 2019-04-15 02:06:08 -07:00
rand_ish clippy: Resolve const_static_lifetime 2019-04-17 17:22:50 -07:00
render_node_forward edition: Remove extern crate lines 2019-04-15 02:06:08 -07:00
resources crosvm: virtio-pmem device 2019-06-05 07:28:54 +00:00
seccomp gpu: Add sandboxing support for pvr. 2019-08-01 19:34:05 +00:00
src crosvm: silence unused code warning for NonZeroU8 2019-08-22 20:43:24 +00:00
sync edition: Update sync crate to 2018 edition 2019-03-17 14:38:44 -07:00
sys_util sys_util: drop redundant empty return type 2019-07-30 05:35:30 +00:00
syscall_defines edition: Update syscall_defines to 2018 edition 2019-04-07 16:31:17 -07:00
tempfile tempfile: Unify the two tempdir implementations 2019-07-11 16:15:38 -07:00
tests edition: Eliminate blocks superseded by NLL 2019-04-17 17:22:57 -07:00
tpm2 crosvm: add license blurb to all files 2019-04-24 15:51:38 -07:00
tpm2-sys
usb_util usb: switch to new libusb_wrap_sys_device API 2019-06-27 17:51:06 +00:00
vfio_sys vfio_sys: Add vfio.h to vfio_sys 2019-08-16 23:16:53 +00:00
vhost clippy: Resolve cast_ptr_alignment 2019-04-18 19:51:29 -07:00
virtio_sys crosvm: add license blurb to all files 2019-04-24 15:51:38 -07:00
vm_control crosvm: {WlDriverRequest, WlDriverResponse} --> {VmMemoryRequest, VmMemoryResponse} 2019-05-24 15:09:26 -07:00
x86_64 crosvm: virtio-pmem device 2019-06-05 07:28:54 +00:00
.dockerignore add docker supported builds and tests 2019-05-15 13:36:19 -07:00
.gitignore
.gitmodules
.rustfmt.toml
build_test
build_test.py build_test.py: test more packages 2019-07-03 20:39:50 +00:00
Cargo.lock tempfile: Unify the two tempdir implementations 2019-07-11 16:15:38 -07:00
Cargo.toml split crosvm into a library and a main "crosvm" binary 2019-08-06 19:23:06 +00:00
LICENSE
OWNERS Add an OWNERS file 2019-05-25 02:31:07 -07:00
README.md add docker supported builds and tests 2019-05-15 13:36:19 -07:00
rust-toolchain rust-toolchain: upgrade to Rust 1.36.0 2019-07-30 05:35:31 +00:00

crosvm - The Chrome OS Virtual Machine Monitor

This component, known as crosvm, runs untrusted operating systems along with virtualized devices. No actual hardware is emulated. It runs VMs only through Linux's KVM interface. What makes crosvm unique is a focus on safety within the programming language and a sandbox around the virtual devices to protect the kernel from attack in case of an exploit in the devices.

Building with Docker

See the README from the docker subdirectory to learn how to build crosvm in environments outside of the Chrome OS chroot.

Usage

To see the usage information for your version of crosvm, run crosvm or crosvm run --help.

Boot a Kernel

To run a very basic VM with just a kernel and default devices:

$ crosvm run "${KERNEL_PATH}"

The uncompressed kernel image, also known as vmlinux, can be found in your kernel build directory; for x86, it is at arch/x86/boot/compressed/vmlinux.

Rootfs

In most cases, you will want to give the VM a virtual block device to use as a root file system:

$ crosvm run -r "${ROOT_IMAGE}" "${KERNEL_PATH}"

The root image must be a path to a disk image formatted in a way that the kernel can read. Typically this is a squashfs image made with mksquashfs or an ext4 image made with mkfs.ext4. When the -r argument is used, the kernel is automatically told to use that image as the root, so -r can be given only once. More disks can be given with -d, or with --rwdisk if a writable disk is desired.

To run crosvm with a writable rootfs:

WARNING: Writable disks are at risk of corruption by a malicious or malfunctioning guest OS.

$ crosvm run --rwdisk "${ROOT_IMAGE}" -p "root=/dev/vda" vmlinux

NOTE: If more disk arguments are added prior to the desired rootfs image, root=/dev/vda must be adjusted to the appropriate letter.
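
The letter adjustment described in the note can be sketched as a small shell helper. The function name root_dev is hypothetical, and the sketch assumes the guest names virtio block devices vda, vdb, and so on in the order the disk arguments appear on the command line.

```shell
# Hypothetical helper (not part of crosvm): map a zero-based disk position
# to the virtio block device name the guest is assumed to assign.
# Only the first 26 disks (vda through vdz) are handled.
root_dev() {
    index=$1
    if [ "$index" -lt 0 ] || [ "$index" -gt 25 ]; then
        echo "disk index out of range: $index" >&2
        return 1
    fi
    # Pick the (index + 1)-th letter of the alphabet.
    letter=$(printf 'abcdefghijklmnopqrstuvwxyz' | cut -c "$((index + 1))")
    echo "/dev/vd${letter}"
}

# Example (commented out so the sketch does not start a VM):
# the rootfs is the second disk argument, so it is expected at /dev/vdb.
# crosvm run -d "${DATA_IMAGE}" --rwdisk "${ROOT_IMAGE}" -p "root=$(root_dev 1)" vmlinux
```

Here DATA_IMAGE is an illustrative variable for an extra disk placed before the rootfs.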

Control Socket

If the control socket was enabled with -s, the main process can be controlled while crosvm is running. To tell crosvm to stop and exit, for example:

NOTE: If the socket path given is for a directory, a socket name underneath that path will be generated based on crosvm's PID.

$ crosvm run -s /run/crosvm.sock ${USUAL_CROSVM_ARGS}
    <in another shell>
$ crosvm stop /run/crosvm.sock

WARNING: The guest OS will not be notified or shut down gracefully.

This will cause the original crosvm process to exit in an orderly fashion, allowing it to clean up any OS resources that might have stuck around if crosvm were terminated early.

Multiprocess Mode

By default crosvm runs in multiprocess mode. Each device that supports running inside of a sandbox will run in a jailed child process of crosvm. The appropriate minijail seccomp policy files must be present either in /usr/share/policy/crosvm or in the path specified by the --seccomp-policy-dir argument. The sandbox can be disabled for testing with the --disable-sandbox option.

Virtio Wayland

Virtio Wayland support requires special support on the part of the guest and as such is unlikely to work out of the box unless you are using a Chrome OS kernel along with a termina rootfs.

To use it, ensure that the XDG_RUNTIME_DIR environment variable is set and that the path $XDG_RUNTIME_DIR/wayland-0 points to the socket of the Wayland compositor you would like the guest to use.
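
A pre-flight check along these lines can confirm that the socket is in place before the VM is started. The function name wayland_socket_ready is hypothetical; this is a sketch, not something crosvm provides.

```shell
# Hypothetical helper: succeed only if XDG_RUNTIME_DIR is set and
# $XDG_RUNTIME_DIR/wayland-0 exists and is a socket.
wayland_socket_ready() {
    [ -n "${XDG_RUNTIME_DIR:-}" ] && [ -S "${XDG_RUNTIME_DIR}/wayland-0" ]
}

# Example use before starting crosvm:
if wayland_socket_ready; then
    echo "Wayland socket found at ${XDG_RUNTIME_DIR}/wayland-0"
else
    echo "No Wayland socket found; virtio wayland will not be usable"
fi
```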

Defaults

The following are crosvm's default arguments and how to override them.

  • 256MB of memory (set with -m)
  • 1 virtual CPU (set with -c)
  • no block devices (set with -r, -d, or --rwdisk)
  • no network (set with --host_ip, --netmask, and --mac)
  • virtio wayland support if the XDG_RUNTIME_DIR environment variable is set (disable with --no-wl)
  • only the kernel arguments necessary to run with the supported devices (add more with -p)
  • run in multiprocess mode (run in single process mode with --disable-sandbox)
  • no control socket (set with -s)
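
As an illustration of overriding several of these defaults at once, the sketch below wraps the command line in a hypothetical run_vm function. The values (1024 for -m, 2 for -c, the socket path) are examples only, -m is assumed to take a size in megabytes to match the 256MB default, and the DRY_RUN, MEM_MB, CPUS, and SOCK variables are inventions of this sketch, not crosvm options.

```shell
# Hypothetical wrapper: run_vm ROOT_IMAGE KERNEL_PATH
# Overrides memory, CPU count, Wayland, and control socket defaults.
# With DRY_RUN set, the command line is printed instead of executed.
run_vm() {
    set -- crosvm run \
        -m "${MEM_MB:-1024}" \
        -c "${CPUS:-2}" \
        --no-wl \
        -s "${SOCK:-/run/crosvm.sock}" \
        -r "$1" \
        "$2"
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Print the command line that would be used, without starting a VM.
DRY_RUN=1 run_vm rootfs.img vmlinux
```

Printing the command first makes it easy to confirm which defaults are being overridden before a VM is actually started.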

System Requirements

A Linux kernel with KVM support (check for /dev/kvm) is required to run crosvm. In order to run certain devices, there are additional system requirements:

  • virtio-wayland - The memfd_create syscall, introduced in Linux 3.17, and a Wayland compositor.
  • vsock - Host Linux kernel with vhost-vsock support, introduced in Linux 4.8.
  • multiprocess - Host Linux kernel with seccomp-bpf and Linux namespacing support.
  • virtio-net - Host Linux kernel with TUN/TAP support (check for /dev/net/tun) and running with CAP_NET_ADMIN privileges.
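
The checks named above can be sketched as a small host report. The function names kernel_at_least and check_host are hypothetical. A new enough kernel version is only a necessary condition here: it does not prove that vhost-vsock or other features were enabled when the kernel was built.

```shell
# kernel_at_least MAJOR.MINOR RELEASE
# Succeeds if RELEASE (for example the output of `uname -r`) is at least
# MAJOR.MINOR. Only the first two numeric components are compared.
kernel_at_least() {
    want_major=${1%%.*}
    want_minor=${1#*.}
    have=${2%%[!0-9.]*}       # drop suffixes such as "-generic"
    have_major=${have%%.*}
    rest=${have#*.}
    have_minor=${rest%%.*}
    [ "$have_major" -gt "$want_major" ] ||
        { [ "$have_major" -eq "$want_major" ] && [ "$have_minor" -ge "$want_minor" ]; }
}

# Print one line per requirement listed in this section.
check_host() {
    release=$(uname -r)
    [ -e /dev/kvm ] && echo "kvm: ok" || echo "kvm: /dev/kvm missing"
    [ -e /dev/net/tun ] && echo "virtio-net: ok" || echo "virtio-net: /dev/net/tun missing"
    kernel_at_least 3.17 "$release" && echo "virtio-wayland: kernel ok" \
        || echo "virtio-wayland: kernel older than 3.17"
    kernel_at_least 4.8 "$release" && echo "vsock: kernel ok" \
        || echo "vsock: kernel older than 4.8"
}
```

Calling check_host on the host prints four status lines; it does not check for a Wayland compositor, seccomp-bpf, namespaces, or CAP_NET_ADMIN.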

Emulated Devices

Device          Description
CMOS/RTC        Used to get the current calendar time.
i8042           Used by the guest kernel to exit crosvm.
serial          x86 I/O port driven serial devices that print to stdout and take input from stdin.
virtio-block    Basic read/write block device.
virtio-net      Device to interface the host and guest networks.
virtio-rng      Entropy source used to seed the guest OS's entropy pool.
virtio-vsock    Enables VSOCKs for the guest.
virtio-wayland  Allows the guest to use the host Wayland socket.

Contributing

Code Health

build_test

There are no automated tests run before code is committed to crosvm. In order to maintain sanity, please execute build_test before submitting code for review. All tests should be passing or ignored and there should be no compiler warnings or errors. All supported architectures are built, but only tests for x86_64 are run. In order to build everything without failures, sysroots must be supplied for each architecture. See build_test -h for more information.

rustfmt

All code should be formatted with rustfmt. We have a script that applies rustfmt to all Rust code in the crosvm repo: please run bin/fmt before checking in a change. This differs from cargo fmt --all, which formats multiple crates but only within a single workspace; crosvm consists of multiple workspaces.

Dependencies

With a few exceptions, external dependencies inside of the Cargo.toml files are not allowed. The reason is that community-made crates tend to explode the binary size by including dozens of transitive dependencies. All of these dependencies must also be reviewed to ensure their suitability to the crosvm project. Currently allowed crates are:

  • byteorder - A very small library used for endian swaps.
  • cc - Build time dependency needed to build C source code used in crosvm.
  • libc - Required to use the standard library, this crate is a simple wrapper around libc's symbols.
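
One way to spot-check a manifest against this list is sketched below. It is not part of crosvm's tooling: the function names and the ALLOWED_CRATES variable are hypothetical, entries with a path = key are assumed to be in-tree crates, and only simple name = ... lines under a section whose header ends in dependencies] are recognized (table-style [dependencies.name] sections are not handled).

```shell
# Crates named as allowed in this section.
ALLOWED_CRATES="byteorder cc libc"

# Print the names of dependencies in a Cargo.toml that have no `path =` key.
external_deps() {
    awk '
        /^\[/ { in_deps = ($0 ~ /dependencies\]$/); next }
        in_deps && /^[A-Za-z0-9_-]+[ \t]*=/ && $0 !~ /path[ \t]*=/ {
            sub(/[ \t]*=.*/, "")   # keep only the crate name
            print
        }
    ' "$1"
}

# Print external dependencies that are not in ALLOWED_CRATES.
disallowed_deps() {
    external_deps "$1" | while read -r name; do
        case " $ALLOWED_CRATES " in
            *" $name "*) ;;          # on the allowed list
            *) echo "$name" ;;
        esac
    done
}
```

Running disallowed_deps Cargo.toml prints nothing when every external dependency is on the list, so it can serve as a quick check before sending a change for review.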

Code Overview

The crosvm source code is written in Rust and C. To build, crosvm generally requires the most recent stable version of rustc.

Source code is organized into crates, each with their own unit tests. These crates are:

  • crosvm - The top-level binary front-end for using crosvm.
  • devices - Virtual devices exposed to the guest OS.
  • io_jail - Creates jailed processes using libminijail.
  • kernel_loader - Loads elf64 kernel files to a slice of memory.
  • kvm_sys - Low-level (mostly) auto-generated structures and constants for using KVM.
  • kvm - Unsafe, low-level wrapper code for using kvm_sys.
  • net_sys - Low-level (mostly) auto-generated structures and constants for creating TUN/TAP devices.
  • net_util - Wrapper for creating TUN/TAP devices.
  • sys_util - Mostly safe wrappers for small system facilities such as eventfd or syslog.
  • syscall_defines - Lists of syscall numbers in each architecture used to make syscalls not supported in libc.
  • vhost - Wrappers for creating vhost based devices.
  • virtio_sys - Low-level (mostly) auto-generated structures and constants for interfacing with kernel vhost support.
  • vm_control - IPC for the VM.
  • x86_64 - Support code specific to 64-bit Intel machines.

The seccomp folder contains minijail seccomp policy files for each sandboxed device. Because some syscalls vary by architecture, the seccomp policies are split by architecture.